{
  "cells": [
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "ScitaPqhKtuW"
      },
      "source": [
        "##### Copyright 2022 The TensorFlow GNN Authors.\n",
        "\n",
        "Licensed under the Apache License, Version 2.0 (the \"License\");"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "hMqWDc_m6rUC"
      },
      "outputs": [],
      "source": [
        "# Licensed under the Apache License, Version 2.0 (the \"License\");\n",
        "# you may not use this file except in compliance with the License.\n",
        "# You may obtain a copy of the License at\n",
        "#\n",
        "# https://www.apache.org/licenses/LICENSE-2.0\n",
        "#\n",
        "# Unless required by applicable law or agreed to in writing, software\n",
        "# distributed under the License is distributed on an \"AS IS\" BASIS,\n",
        "# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n",
        "# See the License for the specific language governing permissions and\n",
        "# limitations under the License."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "udvGTpefWRE_"
      },
      "source": [
        "# An in-depth look at TF-GNN for OGBN-MAG\n"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "ev9vJpM94c3i"
      },
      "source": [
        "\u003ctable class=\"tfo-notebook-buttons\" align=\"left\"\u003e\n",
        "  \u003ctd\u003e\n",
        "    \u003ca target=\"_blank\" href=\"https://colab.research.google.com/github/tensorflow/gnn/blob/master/examples/notebooks/ogbn_mag_indepth.ipynb\"\u003e\u003cimg src=\"https://www.tensorflow.org/images/colab_logo_32px.png\" /\u003eRun in Google Colab\u003c/a\u003e\n",
        "  \u003c/td\u003e\n",
        "  \u003ctd\u003e\n",
        "    \u003ca target=\"_blank\" href=\"https://github.com/tensorflow/gnn/blob/main/examples/notebooks/ogbn_mag_indepth.ipynb\"\u003e\u003cimg src=\"https://www.tensorflow.org/images/GitHub-Mark-32px.png\" /\u003eView on GitHub\u003c/a\u003e\n",
        "  \u003c/td\u003e\n",
        "\u003c/table\u003e"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "rEvXnZOrWRC2"
      },
      "source": [
        "### Abstract\n",
        "\n",
        "This tutorial solves the same task as the basic \"[OGBN-MAG end-to-end](https://colab.research.google.com/github/tensorflow/gnn/blob/master/examples/notebooks/ogbn_mag_e2e.ipynb)\" tutorial, but at a lower API level: it introduces you to the `tfgnn.GraphTensor` type, uses it to build a Graph Neural Network in Keras, and demonstrates how to train the network without the abstractions provided by the TF-GNN Runner.\n",
        "\n",
        "This tutorial is intended for advanced users who have a working knowledge of TF2/Keras already and need to exercise greater control over the model definition and the training code."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "TkPEzhxOV_XF"
      },
      "source": [
        "## Colab set-up"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "pURiJKTuYR0D"
      },
      "outputs": [],
      "source": [
        "!pip install -q tensorflow-gnn || echo \"Ignoring package errors...\""
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "executionInfo": {
          "elapsed": 6665,
          "status": "ok",
          "timestamp": 1711613646349,
          "user": {
            "displayName": "Arno Eigenwillig",
            "userId": "11315922694496346185"
          },
          "user_tz": -60
        },
        "id": "6wht6mjUuZeA",
        "outputId": "8ee4cfaa-88f5-4cf6-a1da-00346bbd1857"
      },
      "outputs": [
        {
          "name": "stdout",
          "output_type": "stream",
          "text": [
            "Running TF-GNN 1.0.2 under TensorFlow 2.15.0.\n"
          ]
        }
      ],
      "source": [
        "import functools\n",
        "import itertools\n",
        "import os\n",
        "import random\n",
        "import re\n",
        "os.environ[\"TF_USE_LEGACY_KERAS\"] = \"1\"  # For TF2.16+.\n",
        "\n",
        "from google.protobuf import text_format\n",
        "import tensorflow as tf\n",
        "import tensorflow_gnn as tfgnn\n",
        "\n",
        "print(f\"Running TF-GNN {tfgnn.__version__} under TensorFlow {tf.__version__}.\")"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "-WzKtCIdys-I"
      },
      "source": [
        "## Introduction"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "B0Thnw2VYjsi"
      },
      "source": [
        "### Problem statement and dataset\n",
        "\n",
        "OGBN-MAG is [Open Graph Benchmark](https://ogb.stanford.edu)'s node classification task on a subset of the [Microsoft Academic Graph](https://www.microsoft.com/en-us/research/publication/microsoft-academic-graph-when-experts-are-not-enough/). The [basic tutorial](https://colab.research.google.com/github/tensorflow/gnn/blob/master/examples/notebooks/ogbn_mag_e2e.ipynb) explains the dataset and the task in detail. To recap, the OGBN-MAG dataset is one big graph with node sets \"paper\", \"field_of_study\", \"author\", and \"institution\" and edge sets \"cites\", \"has_topic\", \"writes\", and \"affiliated_with\", each connecting two particular node sets.\n",
        "\n",
        "The task is to **predict the venue** (journal or conference, not represented in the graph itself) at which each of the papers has been published. Based on their \"year\" feature, the \"paper\" nodes are split into \"train\" (`year\u003c=2017`), \"validation\" (`year==2018`), and \"test\" (`year==2019`).\n"
      ]
    },
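    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "As a minimal sketch, the split rule above can be written as a small function. The name `split_for_year` is hypothetical (it is not part of TF-GNN or OGB), and raising an error for years after 2019 is an assumption made for illustration."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "# Hypothetical helper, not part of TF-GNN or OGB: maps the \"year\" feature\n",
        "# of a \"paper\" node to the name of its split.\n",
        "def split_for_year(year):\n",
        "  if year \u003c= 2017:\n",
        "    return \"train\"\n",
        "  if year == 2018:\n",
        "    return \"validation\"\n",
        "  if year == 2019:\n",
        "    return \"test\"\n",
        "  # Assumption: later years do not occur in the dataset.\n",
        "  raise ValueError(f\"Unexpected year for OGBN-MAG: {year}\")\n",
        "\n",
        "print([split_for_year(y) for y in (2010, 2017, 2018, 2019)])"
      ]
    },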
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "fGC9V4AZfXhs"
      },
      "source": [
        "### Approach\n",
        "\n",
        "To stay scalable for even bigger datasets, we approach this task with **graph sampling**: Each \"paper\" node becomes one training example, expressed by a subgraph that has the node to be classified as its root and stores a sample of its neighborhood in the original graph. The sample is taken by going out a fixed number of steps along specific edge sets, and randomly downsampling the edges in each step if they are too numerous.\n",
        "\n",
        "The actual **TensorFlow model** runs on batches of these sampled subgraphs, applies a Graph Neural Network to propagate information from related nodes towards the root node of each subgraph, and then applies a softmax classifier to predict one of 349 classes (each venue is a class).\n"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "XKDrbUvvyx4u"
      },
      "source": [
        "## Data preparation and graph sampling\n"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "JahYwcbcU1AU"
      },
      "source": [
        "Data preparation and sampling are identical to the [basic tutorial](https://colab.research.google.com/github/tensorflow/gnn/blob/master/examples/notebooks/ogbn_mag_e2e.ipynb) and are not repeated here. To recap, the sampling expected by the model in this colab proceeds as follows:\n",
        "\n",
        "  1. Start from all \"paper\" nodes.\n",
        "  2. For each paper from 1, follow a random sample of \"cites\" edges to other \"paper\" nodes.\n",
        "  3. For each paper from 1 or 2, follow a random sample of reversed \"writes\" edges to \"author\" nodes and store them as edge set \"written\".\n",
        "  4. For each author from 3, follow a random sample of \"writes\" edges to more \"paper\" nodes.\n",
        "  5. For each author from 3, follow a random sample of \"affiliated_with\" edges to \"institution\" nodes.\n",
        "  6. For each paper from 1, 2 or 4, follow a random sample of \"has_topic\" edges to \"field_of_study\" nodes.\n"
      ]
    },
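    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "The six steps above can be sketched in plain Python on a toy graph. This is only an illustration of the traversal order, not the actual graph sampler: `SAMPLING_PLAN`, `sample_subgraph`, the fanout of 2, and all node names in `toy_graph` are made up for this example."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "import random\n",
        "\n",
        "# Hypothetical plan mirroring steps 2-6 above. Each entry is\n",
        "# (step, steps whose nodes act as sources, edge set, fanout).\n",
        "# The fanout of 2 is made up for illustration.\n",
        "SAMPLING_PLAN = [\n",
        "    (2, [1], \"cites\", 2),\n",
        "    (3, [1, 2], \"written\", 2),\n",
        "    (4, [3], \"writes\", 2),\n",
        "    (5, [3], \"affiliated_with\", 2),\n",
        "    (6, [1, 2, 4], \"has_topic\", 2),\n",
        "]\n",
        "\n",
        "def sample_subgraph(seed, adjacency, plan=SAMPLING_PLAN, rng=None):\n",
        "  \"\"\"Returns {edge_set: [(source, target), ...]} sampled around `seed`.\"\"\"\n",
        "  rng = rng or random.Random(0)\n",
        "  reached = {1: [seed]}  # Step 1: the seed \"paper\" node.\n",
        "  edges = {}\n",
        "  for step, from_steps, edge_set, fanout in plan:\n",
        "    sources = sorted({n for s in from_steps for n in reached[s]})\n",
        "    reached[step] = []\n",
        "    for source in sources:\n",
        "      neighbors = adjacency[edge_set].get(source, [])\n",
        "      # Randomly downsample if there are more neighbors than the fanout.\n",
        "      chosen = rng.sample(neighbors, min(fanout, len(neighbors)))\n",
        "      edges.setdefault(edge_set, []).extend((source, t) for t in chosen)\n",
        "      reached[step].extend(chosen)\n",
        "  return edges\n",
        "\n",
        "# A toy graph with made-up node names, keyed by edge set and source node.\n",
        "toy_graph = {\n",
        "    \"cites\": {\"p0\": [\"p1\"]},\n",
        "    \"written\": {\"p0\": [\"a0\"], \"p1\": [\"a0\", \"a1\"]},\n",
        "    \"writes\": {\"a0\": [\"p0\", \"p2\"], \"a1\": [\"p1\"]},\n",
        "    \"affiliated_with\": {\"a0\": [\"i0\"]},\n",
        "    \"has_topic\": {\"p0\": [\"f0\"], \"p1\": [\"f0\", \"f1\"], \"p2\": [\"f1\"]},\n",
        "}\n",
        "toy_edges = sample_subgraph(\"p0\", toy_graph)\n",
        "print(toy_edges)"
      ]
    },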
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "0OxXoNBDTQLa"
      },
      "source": [
        "## Reading the sampled subgraphs"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "1OhCfOC9TjGD"
      },
      "source": [
        "The result of sampling is available here (subject to this [license](https://storage.googleapis.com/download.tensorflow.org/data/ogbn-mag/sampled/v1/edge/LICENSE.txt)):\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "LzTncFRFZFco"
      },
      "outputs": [],
      "source": [
        "input_file_pattern = \"gs://download.tensorflow.org/data/ogbn-mag/sampled/v1/edge/samples-?????-of-00100\"\n",
        "graph_schema_file = \"gs://download.tensorflow.org/data/ogbn-mag/sampled/v1/edge/schema.pbtxt\"\n",
        "graph_schema = tfgnn.read_schema(graph_schema_file)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "yp8JQF-tYRVP"
      },
      "source": [
        "TF-GNN's guide on [Describing your Graph](https://github.com/tensorflow/gnn/blob/main/tensorflow_gnn/docs/guide/schema.md) explains what a graph schema is. Here is the one for this dataset of sampled subgraphs:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "executionInfo": {
          "elapsed": 5,
          "status": "ok",
          "timestamp": 1711613646688,
          "user": {
            "displayName": "Arno Eigenwillig",
            "userId": "11315922694496346185"
          },
          "user_tz": -60
        },
        "id": "j3aOVEReYOLO",
        "outputId": "f2976ee0-89ed-4ff2-8b7b-1f23506b2990"
      },
      "outputs": [
        {
          "data": {
            "text/plain": [
              "context {\n",
              "  features {\n",
              "    key: \"seed_id\"\n",
              "    value {\n",
              "      dtype: DT_STRING\n",
              "    }\n",
              "  }\n",
              "  features {\n",
              "    key: \"sample_id\"\n",
              "    value {\n",
              "      dtype: DT_STRING\n",
              "    }\n",
              "  }\n",
              "  metadata {\n",
              "  }\n",
              "}\n",
              "node_sets {\n",
              "  key: \"paper\"\n",
              "  value {\n",
              "    features {\n",
              "      key: \"year\"\n",
              "      value {\n",
              "        dtype: DT_INT64\n",
              "        shape {\n",
              "          dim {\n",
              "            size: 1\n",
              "          }\n",
              "        }\n",
              "      }\n",
              "    }\n",
              "    features {\n",
              "      key: \"labels\"\n",
              "      value {\n",
              "        dtype: DT_INT64\n",
              "        shape {\n",
              "          dim {\n",
              "            size: 1\n",
              "          }\n",
              "        }\n",
              "      }\n",
              "    }\n",
              "    features {\n",
              "      key: \"feat\"\n",
              "      value {\n",
              "        dtype: DT_FLOAT\n",
              "        shape {\n",
              "          dim {\n",
              "            size: 128\n",
              "          }\n",
              "        }\n",
              "      }\n",
              "    }\n",
              "    features {\n",
              "      key: \"#id\"\n",
              "      value {\n",
              "        dtype: DT_STRING\n",
              "      }\n",
              "    }\n",
              "    metadata {\n",
              "    }\n",
              "  }\n",
              "}\n",
              "node_sets {\n",
              "  key: \"institution\"\n",
              "  value {\n",
              "    features {\n",
              "      key: \"#id\"\n",
              "      value {\n",
              "        dtype: DT_STRING\n",
              "      }\n",
              "    }\n",
              "    metadata {\n",
              "    }\n",
              "  }\n",
              "}\n",
              "node_sets {\n",
              "  key: \"field_of_study\"\n",
              "  value {\n",
              "    features {\n",
              "      key: \"#id\"\n",
              "      value {\n",
              "        dtype: DT_STRING\n",
              "      }\n",
              "    }\n",
              "    metadata {\n",
              "    }\n",
              "  }\n",
              "}\n",
              "node_sets {\n",
              "  key: \"author\"\n",
              "  value {\n",
              "    features {\n",
              "      key: \"#id\"\n",
              "      value {\n",
              "        dtype: DT_STRING\n",
              "      }\n",
              "    }\n",
              "    metadata {\n",
              "    }\n",
              "  }\n",
              "}\n",
              "edge_sets {\n",
              "  key: \"written\"\n",
              "  value {\n",
              "    source: \"paper\"\n",
              "    target: \"author\"\n",
              "    metadata {\n",
              "      extra {\n",
              "        key: \"edge_type\"\n",
              "        value: \"reversed\"\n",
              "      }\n",
              "    }\n",
              "  }\n",
              "}\n",
              "edge_sets {\n",
              "  key: \"writes\"\n",
              "  value {\n",
              "    source: \"author\"\n",
              "    target: \"paper\"\n",
              "    metadata {\n",
              "    }\n",
              "  }\n",
              "}\n",
              "edge_sets {\n",
              "  key: \"has_topic\"\n",
              "  value {\n",
              "    source: \"paper\"\n",
              "    target: \"field_of_study\"\n",
              "    metadata {\n",
              "    }\n",
              "  }\n",
              "}\n",
              "edge_sets {\n",
              "  key: \"cites\"\n",
              "  value {\n",
              "    source: \"paper\"\n",
              "    target: \"paper\"\n",
              "    metadata {\n",
              "    }\n",
              "  }\n",
              "}\n",
              "edge_sets {\n",
              "  key: \"affiliated_with\"\n",
              "  value {\n",
              "    source: \"author\"\n",
              "    target: \"institution\"\n",
              "    metadata {\n",
              "    }\n",
              "  }\n",
              "}"
            ]
          },
          "execution_count": 5,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "graph_schema"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "WeBmhJANUv6r"
      },
      "source": [
        "Training a neural network with stochastic gradient descent requires randomly shuffled training data, but ours is too big to reshuffle fully on the fly while reading. Fortunately, the graph sampler tool has already shuffled its outputs before writing them to a sharded TFRecord file.\n",
        "\n",
        "For speed, we want to read from several shards in parallel.\n",
        "For distributed training, each trainer replica reads from its own subset of the shards. To achieve some randomization between training runs, each replica reshuffles the order of input shards and then shuffles examples within a moderate window.\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "Jo5O8Kz7KAFV"
      },
      "outputs": [],
      "source": [
        "def _get_dataset(file_pattern, *, shuffle=False, filter_fn=None,\n",
        "                 input_context=None):\n",
        "  # For your own file system or GCS bucket, call the usual helper\n",
        "  # filenames = tf.io.gfile.glob(file_pattern)\n",
        "  # For gs://download.tensorflow.org, we avoid listing it and do\n",
        "  filenames = _glob_sharded(file_pattern)\n",
        "  ds = tf.data.Dataset.from_tensor_slices(filenames)\n",
        "  if input_context and input_context.num_input_pipelines \u003e 1:\n",
        "    ds = ds.shard(input_context.num_input_pipelines,\n",
        "                  input_context.input_pipeline_id)\n",
        "  if shuffle:\n",
        "    ds = ds.shuffle(len(filenames))\n",
        "\n",
        "  def interleave_fn(filename):\n",
        "    ds = tf.data.TFRecordDataset(filename)\n",
        "    if filter_fn is not None:\n",
        "      ds = ds.filter(filter_fn)\n",
        "    return ds\n",
        "  ds = ds.interleave(\n",
        "      interleave_fn, cycle_length=10,\n",
        "      deterministic=False, num_parallel_calls=tf.data.AUTOTUNE)\n",
        "  if shuffle:\n",
        "    ds = ds.shuffle(10000)\n",
        "  ds = ds.prefetch(tf.data.AUTOTUNE)\n",
        "  return ds\n",
        "\n",
        "def _glob_sharded(file_pattern):\n",
        "  match = re.fullmatch(r\"(.*)-\\?\\?\\?\\?\\?-of-(\\d\\d\\d\\d\\d)\", file_pattern)\n",
        "  if match is None:  # No shard suffix found.\n",
        "    return [file_pattern]\n",
        "  basename = match[1]\n",
        "  n = int(match[2])\n",
        "  return [f\"{basename}-{i:05d}-of-{n:05d}\" for i in range(n)]"
      ]
    },
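    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "To make the per-replica split in `_get_dataset` concrete, here is a plain-Python stand-in for what `ds.shard(...)` does to the list of filenames: it keeps the elements whose position modulo the number of input pipelines equals the pipeline id. The helper `shard_filenames` and the choice of 4 input pipelines are hypothetical and serve only as an illustration."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "# Hypothetical stand-in for tf.data.Dataset.shard(num_shards, index), which\n",
        "# keeps the elements whose position modulo num_shards equals index.\n",
        "def shard_filenames(filenames, num_input_pipelines, input_pipeline_id):\n",
        "  return filenames[input_pipeline_id::num_input_pipelines]\n",
        "\n",
        "# Shard names as produced for a 100-way sharded file.\n",
        "filenames = [f\"samples-{i:05d}-of-00100\" for i in range(100)]\n",
        "for pipeline_id in range(4):  # Assumption: 4 input pipelines.\n",
        "  subset = shard_filenames(filenames, 4, pipeline_id)\n",
        "  print(pipeline_id, len(subset), subset[:2])"
      ]
    },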
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "sY6-Y3d9I9Sc"
      },
      "source": [
        "Graph sampling stores each sampled subgraph in its output as one `tf.Example` proto, with structured feature names, as explained in the [Data Preparation](https://github.com/tensorflow/gnn/blob/main/tensorflow_gnn/docs/guide/data_prep.md) guide. If you're curious, take a look here:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "qXkkWpWLKtdv"
      },
      "outputs": [],
      "source": [
        "demo_ds = _get_dataset(input_file_pattern.replace(\"-?????-of-00100\", \"-00000-of-00100\"))"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "executionInfo": {
          "elapsed": 775,
          "status": "ok",
          "timestamp": 1711613648672,
          "user": {
            "displayName": "Arno Eigenwillig",
            "userId": "11315922694496346185"
          },
          "user_tz": -60
        },
        "id": "USZYLqb7J8CI",
        "outputId": "ca58028d-aa85-4be9-c312-f9dffce0a5d5"
      },
      "outputs": [
        {
          "name": "stdout",
          "output_type": "stream",
          "text": [
            "features {\n",
            "  feature {\n",
            "    key: \"context/sample_id\"\n",
            "    value { bytes_list { value: \"paper109537\" } }\n",
            "  }\n",
            "  feature {\n",
            "    key: \"context/seed_id\"\n",
            "    value { bytes_list { value: \"paper109537\" } }\n",
            "  }\n",
            "  feature {\n",
            "    key: \"edges/affiliated_with.#size\"\n",
            "    value { int64_list { value: [6] } }\n",
            "  }\n",
            "  feature {\n",
            "    key: \"edges/affiliated_with.#source\"\n",
            "    value { int64_list { value: [0, 1, 2, 3, 4, 4] } }\n",
            "  }\n",
            "  feature {\n",
            "    key: \"edges/affiliated_with.#target\"\n",
            "    value { int64_list { value: [0, 0, 1, 0, 1, 2] } }\n",
            "  }\n",
            "  feature {\n",
            "    key: \"edges/cites.#size\"\n",
            "    value { int64_list { value: [1] } }\n",
            "  }\n",
            "  feature {\n",
            "    key: \"edges/cites.#source\"\n",
            "    value { int64_list { value: [0] } }\n",
            "  }\n",
            "  feature {\n",
            "    key: \"edges/cites.#target\"\n",
            "    value { int64_list { value: [20] } }\n",
            "  }\n",
            "  feature {\n",
            "    key: \"edges/has_topic.#size\"\n",
            "    value { int64_list { value: [307] } }\n",
            "  }\n",
            "  feature {\n",
            "    key: \"edges/has_topic.#source\"\n",
            "    value { int64_list { value: [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 12, 12, 12, 12, 12, 12, 12, 12, 12, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 22, 22, 22, 22, 22, 22, 22, 22, 22, 23, 23, 23, 23, 23, 23, 23, 23, 24, 24, 24, 24, 24, 24, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 26, 26, 26, 26, 26, 26, 26, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 28, 28, 28, 28, 28, 28, 28, 28, 28, 28, 28, 29, 29, 29, 29, 29, 29, 29, 29, 29, 29, 29] } }\n",
            "  }\n",
            "  feature {\n",
            "    key: \"edges/has_topic.#target\"\n",
            "    value { int64_list { value: [7, 10, 21, 22, 49, 67, 94, 96, 109, 121, 20, 21, 22, 25, 59, 69, 112, 115, 122, 129, 9, 21, 22, 31, 45, 60, 61, 62, 72, 124, 9, 21, 22, 40, 43, 63, 82, 104, 123, 127, 1, 21, 22, 39, 48, 49, 94, 95, 98, 102, 115, 2, 5, 21, 22, 36, 41, 52, 57, 64, 78, 98, 6, 11, 21, 22, 30, 52, 53, 72, 94, 118, 15, 22, 24, 37, 39, 66, 91, 108, 114, 120, 13, 22, 54, 55, 68, 94, 95, 107, 112, 116, 7, 14, 19, 21, 22, 37, 38, 47, 72, 100, 106, 5, 21, 22, 26, 39, 55, 64, 72, 87, 111, 128, 7, 16, 21, 22, 35, 49, 51, 58, 72, 84, 100, 21, 22, 48, 72, 89, 90, 95, 100, 114, 3, 8, 17, 18, 21, 22, 33, 48, 72, 74, 77, 98, 0, 22, 27, 28, 29, 55, 65, 72, 88, 105, 114, 10, 21, 22, 49, 58, 67, 72, 86, 94, 100, 121, 11, 21, 22, 34, 46, 52, 53, 76, 98, 110, 122, 7, 12, 21, 22, 49, 58, 67, 71, 94, 100, 115, 4, 21, 22, 48, 67, 72, 79, 83, 94, 100, 129, 21, 22, 49, 51, 58, 67, 72, 94, 100, 118, 121, 21, 22, 23, 50, 52, 53, 72, 84, 99, 100, 118, 9, 21, 22, 38, 46, 51, 72, 75, 83, 100, 125, 22, 44, 55, 73, 93, 94, 95, 101, 117, 22, 32, 55, 72, 92, 102, 103, 113, 22, 29, 55, 81, 105, 128, 21, 22, 35, 42, 46, 49, 94, 97, 100, 122, 126, 0, 22, 29, 55, 70, 114, 128, 11, 21, 22, 34, 50, 52, 53, 56, 85, 123, 7, 21, 22, 35, 49, 67, 80, 94, 100, 112, 121, 7, 21, 22, 35, 49, 58, 67, 94, 98, 119, 126] } }\n",
            "  }\n",
            "  feature {\n",
            "    key: \"edges/writes.#size\"\n",
            "    value { int64_list { value: [42] } }\n",
            "  }\n",
            "  feature {\n",
            "    key: \"edges/writes.#source\"\n",
            "    value { int64_list { value: [0, 0, 0, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 4, 4, 4] } }\n",
            "  }\n",
            "  feature {\n",
            "    key: \"edges/writes.#target\"\n",
            "    value { int64_list { value: [0, 9, 28, 0, 1, 3, 5, 6, 10, 11, 15, 16, 17, 18, 19, 20, 21, 25, 27, 29, 0, 2, 4, 7, 8, 9, 12, 13, 14, 22, 23, 24, 26, 28, 6, 11, 15, 16, 17, 19, 20, 29] } }\n",
            "  }\n",
            "  feature {\n",
            "    key: \"edges/written.#size\"\n",
            "    value { int64_list { value: [5] } }\n",
            "  }\n",
            "  feature {\n",
            "    key: \"edges/written.#source\"\n",
            "    value { int64_list { value: [0, 0, 0, 20, 20] } }\n",
            "  }\n",
            "  feature {\n",
            "    key: \"edges/written.#target\"\n",
            "    value { int64_list { value: [0, 1, 3, 2, 4] } }\n",
            "  }\n",
            "  feature {\n",
            "    key: \"nodes/author.#id\"\n",
            "    value { bytes_list { value: \"author350023\" value: \"author375102\" value: \"author40482\" value: \"author561083\" value: \"author65732\" } }\n",
            "  }\n",
            "  feature {\n",
            "    key: \"nodes/author.#size\"\n",
            "    value { int64_list { value: [5] } }\n",
            "  }\n",
            "  feature {\n",
            "    key: \"nodes/field_of_study.#id\"\n",
            "    value { bytes_list { value: \"field_of_study10579\" value: \"field_of_study10717\" value: \"field_of_study10977\" value: \"field_of_study11025\" value: \"field_of_study11102\" value: \"field_of_study11185\" value: \"field_of_study11194\" value: \"field_of_study11491\" value: \"field_of_study11638\" value: \"field_of_study11746\" value: \"field_of_study11783\" value: \"field_of_study11860\" value: \"field_of_study12313\" value: \"field_of_study12401\" value: \"field_of_study12463\" value: \"field_of_study12872\" value: \"field_of_study1301\" value: \"field_of_study13335\" value: \"field_of_study13391\" value: \"field_of_study13498\" value: \"field_of_study13574\" value: \"field_of_study13979\" value: \"field_of_study14055\" value: \"field_of_study14071\" value: \"field_of_study14187\" value: \"field_of_study14243\" value: \"field_of_study14803\" value: \"field_of_study15197\" value: \"field_of_study15250\" value: \"field_of_study15255\" value: \"field_of_study15444\" value: \"field_of_study15487\" value: \"field_of_study15859\" value: \"field_of_study16475\" value: \"field_of_study16839\" value: \"field_of_study16870\" value: \"field_of_study16973\" value: \"field_of_study17257\" value: \"field_of_study17704\" value: \"field_of_study18403\" value: \"field_of_study18476\" value: \"field_of_study18738\" value: \"field_of_study18814\" value: \"field_of_study18818\" value: \"field_of_study18870\" value: \"field_of_study19052\" value: \"field_of_study19378\" value: \"field_of_study1945\" value: \"field_of_study19499\" value: \"field_of_study19591\" value: \"field_of_study19898\" value: \"field_of_study20473\" value: \"field_of_study20613\" value: \"field_of_study20727\" value: \"field_of_study20979\" value: \"field_of_study21326\" value: \"field_of_study21338\" value: \"field_of_study2194\" value: \"field_of_study21954\" value: \"field_of_study22043\" value: \"field_of_study22088\" value: \"field_of_study22415\" value: \"field_of_study22420\" value: 
\"field_of_study22453\" value: \"field_of_study22885\" value: \"field_of_study23068\" value: \"field_of_study23415\" value: \"field_of_study23651\" value: \"field_of_study23655\" value: \"field_of_study2373\" value: \"field_of_study23881\" value: \"field_of_study24073\" value: \"field_of_study24175\" value: \"field_of_study24242\" value: \"field_of_study26823\" value: \"field_of_study26890\" value: \"field_of_study2754\" value: \"field_of_study28034\" value: \"field_of_study28552\" value: \"field_of_study29963\" value: \"field_of_study30325\" value: \"field_of_study3097\" value: \"field_of_study31676\" value: \"field_of_study32868\" value: \"field_of_study34227\" value: \"field_of_study34635\" value: \"field_of_study3480\" value: \"field_of_study35058\" value: \"field_of_study38658\" value: \"field_of_study3951\" value: \"field_of_study43486\" value: \"field_of_study4370\" value: \"field_of_study44240\" value: \"field_of_study44839\" value: \"field_of_study4603\" value: \"field_of_study4861\" value: \"field_of_study4966\" value: \"field_of_study53291\" value: \"field_of_study5341\" value: \"field_of_study5697\" value: \"field_of_study5715\" value: \"field_of_study58195\" value: \"field_of_study5834\" value: \"field_of_study58526\" value: \"field_of_study613\" value: \"field_of_study6139\" value: \"field_of_study6439\" value: \"field_of_study6670\" value: \"field_of_study6794\" value: \"field_of_study6830\" value: \"field_of_study6849\" value: \"field_of_study703\" value: \"field_of_study718\" value: \"field_of_study7233\" value: \"field_of_study7276\" value: \"field_of_study8017\" value: \"field_of_study8209\" value: \"field_of_study8264\" value: \"field_of_study8504\" value: \"field_of_study8644\" value: \"field_of_study8723\" value: \"field_of_study8882\" value: \"field_of_study900\" value: \"field_of_study9074\" value: \"field_of_study9118\" value: \"field_of_study9142\" value: \"field_of_study9171\" value: \"field_of_study9489\" value: \"field_of_study9687\" 
value: \"field_of_study982\" } }\n",
            "  }\n",
            "  feature {\n",
            "    key: \"nodes/field_of_study.#size\"\n",
            "    value { int64_list { value: [130] } }\n",
            "  }\n",
            "  feature {\n",
            "    key: \"nodes/institution.#id\"\n",
            "    value { bytes_list { value: \"institution4210\" value: \"institution4218\" value: \"institution7915\" } }\n",
            "  }\n",
            "  feature {\n",
            "    key: \"nodes/institution.#size\"\n",
            "    value { int64_list { value: [3] } }\n",
            "  }\n",
            "  feature {\n",
            "    key: \"nodes/paper.#id\"\n",
            "    value { bytes_list { value: \"paper109537\" value: \"paper100622\" value: \"paper103508\" value: \"paper114805\" value: \"paper174522\" value: \"paper184235\" value: \"paper198716\" value: \"paper213137\" value: \"paper251709\" value: \"paper271713\" value: \"paper300386\" value: \"paper309500\" value: \"paper326446\" value: \"paper341929\" value: \"paper354707\" value: \"paper377948\" value: \"paper389587\" value: \"paper402984\" value: \"paper410065\" value: \"paper427777\" value: \"paper438743\" value: \"paper444296\" value: \"paper497129\" value: \"paper498832\" value: \"paper502918\" value: \"paper546534\" value: \"paper547723\" value: \"paper590736\" value: \"paper73465\" value: \"paper99862\" } }\n",
            "  }\n",
            "  feature {\n",
            "    key: \"nodes/paper.#size\"\n",
            "    value { int64_list { value: [30] } }\n",
            "  }\n",
            "  feature {\n",
            "    key: \"nodes/paper.feat\"\n",
            "    value { float_list { value: [-0.023612, -0.160456, -0.079695, -0.200688, -0.039802, -0.186624, -0.393058, -0.265741, -0.019301, 0.188298, -0.09662, -0.337132, 0.083665, 0.043082, -0.186145, 0.03146, 0.15045, 0.429943, 0.08309, -0.354901, -0.199495, 0.074126, -0.383941, 0.122103, -0.224577, 0.415893, 0.086517, -0.387551, -0.069178, 0.111193, -0.10507, -0.01564, 0.074341, 0.071838, 0.068195, -0.454012, -0.094935, 0.025181, 0.319998, 0.238552, 0.132143, 0.043759, 0.266422, -0.072535, 0.028388, 0.259473, 0.159389, -0.016864, 0.165084, 0.071008, 0.479412, 0.116796, 0.055964, -0.261519, -0.112707, -0.060209, -0.208906, -0.039663, 0.188145, 0.214914, 0.149292, -0.060365, 0.051407, -0.134362, -0.067802, 0.176398, -0.257404, 0.149808, 0.480471, -0.17417, 0.054553, -0.029096, 0.206278, -0.635425, -0.075961, -0.218011, -0.045135, -0.020691, -0.03409, 0.031484, -0.125503, -0.025567, 0.538609, -0.313348, 0.067667, 0.223802, -0.425026, 0.111851, 0.892871, 0.073596, 0.297304, 0.031548, 0.099789, -0.072524, -0.006824, 0.167282, -0.144428, -0.095943, 0.193722, 0.013868, -0.253282, 0.754854, -0.252026, -0.199099, 0.298224, 0.097633, 0.074745, -0.013437, -0.058675, -0.065993, -0.206768, -0.010417, 0.027841, 0.118774, 0.449678, 0.119096, 0.2538, -0.05922, 0.012538, -0.143442, -0.353478, -0.001671, 0.242266, -0.434383, 0.01699, 0.132356, 0.107107, -0.152027, 0.031064, -0.026024, 0.163663, -0.169563, 0.172117, -0.273783, -0.462646, -0.359186, 0.066255, 0.129994, 0.019728, -0.587513, 0.08853, 0.047923, -0.220252, -0.037241, 0.043926, 0.310875, 0.13327, -0.338798, -0.231445, -5.6e-05, -0.560686, 0.165907, -0.233884, 0.39804, 0.105665, -0.395882, -0.160893, -0.038946, 0.08274, -0.090329, 0.055085, -0.1463, -0.140147, -0.40237, -0.241125, 0.109866, 0.364005, 0.065141, 0.176239, 0.124118, 0.119634, -0.057752, -0.00027, 0.259153, 0.156739, -0.049803, 0.148933, 0.023467, 0.204584, 0.208405, 0.124386, -0.179836, -0.13028, -0.133096, -0.18458, -0.174969, -0.093089, -0.042543, 
0.419997, -0.007037, -0.084097, 0.004137, 0.134543, 0.18552, -0.309414, 0.056662, 0.365543, -0.163808, -0.06509, -0.152126, 0.03505, -0.450886, -0.1641, -0.113431, 0.002695, -0.140312, 0.019497, 0.147848, -0.212811, 0.04876, 0.545718, -0.217269, 0.147605, -0.056059, -0.207038, 0.024997, 0.881259, 0.049938, 0.327271, 0.093776, 0.122453, -0.141597, -0.008778, -0.012949, -0.097073, 0.114515, 0.210782, -0.029558, -0.28561, 0.794976, -0.378596, -0.112373, 0.285187, 0.022265, -0.062885, -0.047391, -0.145599, 0.020778, -0.378084, -0.054039, 0.064026, 0.18702, 0.526464, 0.238512, 0.022973, 0.069692, 0.129879, -0.139416, -0.213497, -0.142297, 0.274098, -0.282853, -0.165528, -0.102288, -0.064284, -0.2127, -0.10764, 0.022668, -0.047734, -0.117648, 0.091242, -0.108568, -0.488375, -0.378838, 0.20532, 0.053701, -0.046847, -0.433891, 0.083637, 0.112754, -0.036488, 0.055397, -0.018075, 0.42977, 0.06207, -0.228518, -0.166628, 0.242778, -0.439153, 0.038826, 0.010477, 0.445581, -0.033247, -0.28235, -0.126575, -0.02966, -0.159199, -0.139209, 0.060619, 0.042433, -0.070436, -0.28267, -0.177702, 0.013145, 0.004166, 0.203204, 0.265538, 0.020372, 0.118493, -0.013599, -0.009071, 0.216104, 0.272827, -0.102323, 0.220981, 0.191355, 0.272123, 0.192353, -0.181225, -0.190144, -0.173458, 0.051854, -0.08479, -0.242894, -0.06399, -0.001597, 0.248056, -0.027627, 0.093402, -0.024776, 0.143236, 0.381895, -0.101665, -0.001778, 0.416439, -0.220747, -0.064488, 0.116206, 0.096249, -0.343981, -0.03703, -0.089897, 0.125928, 0.2276, 0.02002, 0.081945, -0.176213, -0.139119, 0.611946, -0.271991, 0.168924, 0.121871, -0.271408, 0.063022, 0.955892, 0.038299, 0.223218, -0.034279, 0.111191, -0.196455, 0.112165, -0.121324, -0.061302, 0.020067, 0.135556, 0.016581, -0.114695, 0.75208, -0.29743, -0.172138, 0.172383, 0.169215, -0.08554, 0.093911, 0.096008, -0.007381, -0.347133, -0.1414, 0.129574, 0.140097, 0.364185, 0.155982, 0.023677, -0.037611, -0.045452, -0.125929, -0.243866, -0.161624, 0.253123, -0.284346, -0.074818, 
0.133354, -0.045286, -0.104689, -0.136773, -0.230187, -0.007537, -0.117128, 0.134471, -0.019646, -0.613208, -0.224676, 0.063981, -0.010744, -0.055194, -0.475098, 0.045877, 0.00075, -0.167013, 0.016099, 0.130443, 0.372843, 0.101278, -0.387778, -0.381355, -0.05776, -0.475087, 0.10115, -0.379671, 0.596031, 0.026466, -0.541794, -0.141728, -0.057096, 0.113767, 0.07067, 0.024044, -0.070595, -0.056578, -0.41572, -0.294507, 0.210867, 0.23179, 0.4726, 0.179213, 0.077867, 0.394006, 0.068947, -0.021282, 0.335376, 0.424465, -0.062276, 0.238816, -0.036471, 0.439514, 0.046056, -0.023528, -0.261879, -0.255005, 0.003089, -0.039062, -0.151343, 0.032452, 0.008917, 0.176692, 0.018272, 0.072328, 0.013383, 0.210338, 0.274891, -0.315423, 0.174903, 0.572373, -0.338776, 0.0532, -0.196881, 0.092051, -0.655329, -0.104671, -0.121091, 0.098388, -0.020181, -0.214819, 0.130772, -0.179492, -0.050124, 0.481791, -0.442479, 0.171065, 0.02857, -0.334063, 0.044422, 0.825984, 0.006274, 0.37759, -0.019988, 0.258655, -0.014406, -0.1715, -0.014165, -0.145223, 0.024863, 0.058378, -0.136334, -0.312202, 0.761384, -0.150581, -0.16736, 0.014856, 0.2753, 0.020729, -0.060043, -0.002411, 0.144197, -0.377392, -0.16434, -0.003597, 0.08675, 0.573763, 0.172209, 0.196374, 0.101451, 0.069434, -0.163197, -0.200953, -0.200704, 0.31644, -0.471659, -0.237379, 0.039656, -0.05737, -0.167327, -0.216035, 0.05991, -0.210756, -0.039138, 0.184042, -0.053501, -0.481492, -0.187115, 0.163449, -0.060813, -0.035855, -0.381499, 0.19751, -0.115917, -0.292694, 0.08659, -0.063236, 0.391759, 0.091275, -0.382785, -0.385523, -0.089556, -0.466824, 0.114619, -0.05926, 0.477741, 0.228687, -0.271135, -0.05119, 0.023081, -0.145122, -0.020666, 0.155285, -0.062706, -0.046717, -0.318168, -0.126797, 0.226243, 0.378113, 0.308695, 0.3599, -0.051537, 0.092385, -0.024884, -0.108543, 0.138051, 0.239878, -0.035593, 0.284443, 0.072225, 0.232531, 0.170145, -0.125163, -0.314875, -0.228649, 0.062364, 0.041351, -0.012116, -0.089544, -0.072586, 0.322693, 
0.03053, 0.050689, -0.123864, 0.172268, 0.026618, -0.244321, -0.016073, 0.360905, -0.188887, -0.056364, -0.006391, -0.022005, -0.428517, -0.246242, -0.275403, 0.203644, -0.178349, 0.134617, 0.222569, -0.231599, 0.286505, 0.605118, -0.336434, 0.268108, 0.022954, -0.526117, 0.252621, 0.881393, 0.042748, 0.382947, 0.095375, 0.115646, -0.147129, -0.15367, -0.053104, -0.164101, -0.044683, -0.034566, 0.011599, -0.044125, 0.76456, -0.230179, -0.147529, 0.041751, 0.253125, 0.145187, 0.104921, -0.008981, 0.062023, -0.32542, 0.020716, 0.217761, 0.27337, 0.312061, -0.073233, 0.208035, 0.100153, -0.117976, -0.03062, -0.054681, -0.076717, 0.285473, -0.180537, -0.11225, 0.052172, -0.148613, 0.068769, -0.069517, -0.112262, -0.052079, -0.066963, 0.097725, -0.103519, -0.538853, -0.249638, 0.057994, 0.081499, -0.062273, -0.429797, 0.106344, 0.070537, -0.254846, 0.039568, -0.019523, 0.427951, 0.189263, -0.355043, -0.253951, 0.110615, -0.385318, -0.010805, -0.132825, 0.434274, 0.042914, -0.372746, -0.138973, 0.051665, 0.028602, -0.018378, 0.074011, 0.054075, -0.052641, -0.310743, -0.143968, 0.179804, 0.138117, 0.212902, 0.146248, -0.03204, 0.20205, 0.090548, 0.045995, 0.149961, 0.17566, -0.014283, 0.266395, 0.131578, 0.328263, 0.151834, -0.03462, -0.207432, -0.143221, 0.07351, -0.109655, -0.144067, -0.010369, 0.04138, 0.201038, -0.055665, 0.087338, -0.036377, 0.079275, 0.252243, -0.202426, 0.039898, 0.391168, -0.194394, -0.033951, -0.108843, 0.209739, -0.598404, -0.148418, -0.27224, 0.067901, 0.034415, -0.031686, 0.076217, -0.163464, -0.050671, 0.545246, -0.28062, 0.077034, 0.090574, -0.250305, 0.0398, 0.842245, 0.201539, 0.453935, -0.050538, 0.16164, -0.061114, -0.065615, -0.052597, 0.016354, 0.129377, 0.087913, -0.131976, -0.259575, 0.749094, -0.358549, -0.176334, 0.256943, 0.098784, -0.011105, -0.019221, 0.049502, 0.081099, -0.257011, -0.18266, 0.027669, 0.053138, 0.374612, 0.20584, 0.130798, 0.012089, 0.02588, -0.147214, -0.237375, -0.08577, 0.279289, -0.242454, -0.113756, 
0.083134, -0.075804, -0.272919, -0.099469, -0.133361, -0.06801, -0.158642, 0.061495, -0.113106, -0.476282, -0.162667, 0.040342, 0.211285, -0.107319, -0.435225, 0.058627, -0.014326, -0.229667, 0.063165, -0.059862, 0.352407, 0.022295, -0.410403, -0.293708, 0.125953, -0.38074, 0.004017, -0.109045, 0.468037, 0.094957, -0.300311, -0.128789, 0.135709, -0.199484, -0.082375, 0.090305, 0.041727, 0.028702, -0.375535, -0.223765, 0.233153, 0.360264, 0.369755, 0.186834, 0.040423, 0.232095, 0.083578, -0.015363, 0.091965, 0.239196, 0.052808, 0.205955, 0.138641, 0.316803, 0.037324, 0.045225, -0.19794, -0.228802, 0.003037, -0.144712, -0.018923, 0.094568, 0.166144, 0.140159, -0.128074, 0.040282, -0.132352, 0.156764, 0.239064, -0.306725, 0.024175, 0.379893, -0.13879, -0.012084, -0.038086, 0.132375, -0.567848, -0.028699, -0.333312, -0.033433, 0.048922, -0.039048, 0.13802, -0.185314, 0.017369, 0.479948, -0.280342, 0.170499, 0.168849, -0.362925, 0.117506, 0.875114, 0.030164, 0.258702, 0.073058, 0.15773, -0.116977, -0.080386, 0.056956, -0.088061, -0.011181, 0.129275, 0.002074, -0.296906, 0.734068, -0.281536, -0.1764, 0.170539, 0.181595, 0.04638, 0.021234, -0.024506, -0.004854, -0.233706, -0.047784, -0.035932, 0.154272, 0.391234, 0.112677, 0.17516, 0.008591, -0.030649, -0.236768, -0.279215, -0.134645, 0.228704, -0.211726, -0.043668, 0.024508, 0.021091, -0.201145, -0.180338, -0.086563, -0.215319, 0.004892, -0.047731, -0.177971, -0.402607, -0.158358, -0.23722, 0.245632, 0.015266, -0.668589, 0.11963, 0.120173, -0.1465, -0.017069, -0.147848, 0.395046, 0.127071, -0.392078, -0.389421, 0.034929, -0.667332, 0.054207, -0.243312, 0.395584, 0.135373, -0.154469, -0.146884, -0.124623, -0.191303, -0.335308, -0.049227, 0.180136, -0.139033, -0.08948, -0.047593, 0.107337, 0.024476, -0.060683, 0.120684, -0.047293, 0.212908, -0.096038, -0.010561, 0.190388, 0.156946, 0.084654, 0.336667, -0.100012, 0.254714, 0.099684, 0.09036, 0.000622, -0.187267, -0.124457, -0.065741, -0.138789, 0.087947, 0.021308, 0.22516, 
-0.214352, -0.038826, 0.14637, 0.061985, 0.126884, -0.157822, -0.007404, 0.560223, -0.205442, -0.036724, -0.003191, 0.254957, -0.496987, -0.240636, -0.303415, -0.116813, -0.035472, 0.157281, 0.149667, -0.095563, -0.087157, 0.584813, -0.271721, 0.067632, 0.019143, 0.027508, -0.039895, 0.92589, 0.065712, 0.368277, -0.145034, 0.170996, 0.04759, 0.023504, -0.041855, -0.049823, 0.24502, 0.025706, 0.073399, -0.152086, 0.865838, -0.303917, -0.161134, 0.181407, 0.258602, 0.105617, 0.104758, -0.160813, 0.209571, -0.440514, -0.000226, 0.028909, -0.006462, 0.211896, 0.203555, 0.34626, 0.196947, -0.003075, -0.13851, -0.234588, -0.057049, 0.109837, -0.169899, -0.026647, 0.070245, -0.137218, -0.417331, -0.122443, -0.300148, -0.090658, 0.053913, 0.190169, -0.277542, -0.407372, -0.265521, -0.086963, 0.148955, 0.0402, -0.463445, -0.013243, -0.14838, -0.359924, -0.05342, -0.043106, 0.267329, 0.192612, -0.390534, -0.336413, 0.12804, -0.508968, -0.005241, -0.170324, 0.561319, 0.02558, -0.266326, 0.034431, 0.085732, -0.126452, 0.026064, 0.178157, -0.00819, -0.108594, -0.418644, -0.132617, 0.16933, 0.241659, 0.168523, 0.245568, 0.014617, 0.30978, 0.289057, -0.084172, 0.226792, 0.29986, -0.078191, 0.208321, 0.096998, 0.162447, 0.025501, -0.171551, -0.346596, -0.19211, -0.038941, -0.177327, -0.098693, 0.224949, -0.009969, 0.203715, -0.121216, 0.028057, -0.12071, -0.077319, 0.075736, -0.151532, 0.104288, 0.349772, -0.187026, -0.113333, 0.022016, 0.511321, -0.438772, -0.144275, -0.422994, 0.101536, -0.172234, 0.032963, 0.172692, -0.045379, 0.038964, 0.376897, -0.303933, 0.2809, 0.182646, -0.463912, 0.122996, 0.82341, 0.000602, 0.248522, 0.08973, 0.175929, -0.114705, -0.068823, 0.147896, -0.270745, -0.086738, 0.231142, 0.066401, -0.111828, 0.922434, -0.262208, -0.034936, 0.229635, 0.175017, 0.219662, 0.102026, -0.135065, -0.090169, -0.301725, 0.034577, -0.007507, 0.223415, 0.397739, 0.13848, 0.255199, 0.023115, -0.036332, -0.177219, -0.301354, -0.15497, 0.125456, -0.172204, 0.010633, 
-0.067597, -0.090158, -0.124195, -0.198264, -0.008217, -0.113278, -0.19694, 0.12591, -0.188493, -0.339483, -0.211913, 0.158716, 0.139225, -0.11196, -0.667167, 0.174483, -0.021607, -0.310018, -0.012143, 0.018076, 0.372521, 0.170771, -0.388705, -0.182754, 0.063152, -0.45372, -0.107341, -0.147217, 0.368616, 0.047654, -0.216777, -0.244097, 0.017675, -0.173871, -0.189264, 0.092068, -0.041365, -0.045826, -0.450902, -0.105729, 0.099152, 0.20054, 0.404807, 0.056701, -0.001968, 0.280991, 0.028877, -0.037091, 0.08261, 0.417513, -0.068432, 0.251018, 0.12502, 0.240979, 0.124953, -0.10237, -0.09644, -0.361462, 0.011587, -0.204273, -0.332409, 0.085868, 0.072035, 0.171635, 0.023729, 0.032536, 0.061214, 0.168012, 0.127831, -0.299147, 0.043483, 0.353563, -0.124216, -0.105325, -0.065076, 0.196194, -0.720735, -0.140203, -0.285369, -0.03195, 0.128127, -0.022133, 0.257323, -0.204443, -0.090257, 0.494036, -0.272612, 0.051491, 0.173965, -0.338293, 0.206775, 0.881935, 0.178577, 0.420981, 0.124638, 0.196229, -0.114591, -0.096651, 0.036273, -0.147357, 0.177206, 0.12497, 0.02008, -0.360753, 0.830267, -0.458144, -0.277232, 0.122158, 0.092351, 0.158142, -0.122964, -0.048799, 0.101415, -0.270334, -0.186325, -0.07619, 0.153655, 0.37604, 0.27871, 0.070923, 0.131578, -0.024355, -0.220659, -0.388875, -0.23311, 0.120108, -0.319365, -0.221783, -0.120941, 0.041119, -0.328785, -0.085883, -0.09128, -0.064219, -0.066734, 0.080282, -0.059883, -0.552467, -0.294738, -0.018078, 0.156356, 0.008699, -0.475295, 0.06459, -0.001505, -0.224658, 0.053252, -0.023304, 0.362258, 0.132594, -0.322503, -0.358553, 0.217205, -0.407077, 0.031637, -0.155439, 0.445643, 0.062696, -0.354499, -0.078601, 0.068168, -0.098991, -0.09179, 0.030843, 0.09077, 0.010741, -0.243238, -0.113034, 0.165792, 0.214112, 0.133601, 0.146804, -0.052317, 0.138166, 0.027318, 0.103705, 0.068354, 0.133812, 0.051929, 0.244979, 0.090601, 0.266956, 0.158987, -0.007291, -0.156078, -0.129304, 0.009912, -0.083962, -0.055221, -0.042046, 0.007399, 0.243705, 
-0.099236, 0.102883, -0.020526, 0.062615, 0.225136, -0.121746, 0.01545, 0.413489, -0.201283, -0.100808, -0.073512, 0.229505, -0.480901, -0.043395, -0.220033, 0.068209, -0.001268, -0.01575, 0.046175, -0.117346, -0.206933, 0.524521, -0.23814, 0.052964, 0.14585, -0.22536, 0.08269, 0.95893, 0.207215, 0.313076, -0.027781, 0.107229, -0.189706, -0.009676, 0.044138, -0.106461, 0.063717, 0.137909, -0.110773, -0.257675, 0.675752, -0.325835, -0.209145, 0.161945, 0.073277, -0.023841, 0.018044, -0.011691, 0.021107, -0.203953, -0.067915, 0.081288, 0.169477, 0.312254, 0.201993, 0.156982, 0.079535, 0.006787, -0.119504, -0.152137, -0.111578, 0.195481, -0.190876, -0.082679, 0.165161, -0.087698, -0.209171, -0.077915, 0.00939, 0.018696, -0.170661, 0.01273, -0.28516, -0.576868, -0.255324, -0.0787, 0.047617, -0.225036, -0.419998, 0.076319, 0.079312, -0.180054, 0.008811, 0.08735, 0.312124, 0.084139, -0.312728, -0.305112, 0.009533, -0.48142, 0.156972, -0.22015, 0.514062, 0.135844, -0.529262, 0.023294, 0.058117, -0.044881, -0.008869, 0.269341, 0.116693, -0.09712, -0.405904, -0.11751, -0.03966, 0.405776, 0.181284, 0.089092, -0.010806, 0.203256, -0.02926, -0.071584, 0.310004, 0.24979, -0.046595, 0.082823, 0.080556, 0.564271, 0.301048, -0.074101, -0.369688, -0.173978, -0.087578, -0.114996, 0.033984, 0.042952, 0.074347, 0.233874, 0.039086, -0.008618, -0.070844, 0.015586, 0.191029, -0.213986, 0.019005, 0.468513, -0.390884, -0.049627, -0.110471, 0.344853, -0.510983, -0.15058, -0.243553, 0.034756, -0.05818, -0.00031, 0.172387, -0.054352, 0.037851, 0.439445, -0.194577, 0.002995, 0.078182, -0.446153, 0.032954, 0.949501, 0.225944, 0.31445, -0.056742, 0.144015, -0.056973, -0.058369, 0.123472, -0.122878, 0.020616, 0.123498, 0.129417, -0.300153, 0.728079, -0.222984, -0.150824, 0.192853, 0.212245, 0.087429, 0.020714, -0.256507, -0.140011, -0.295639, -0.038229, 0.044898, 0.229757, 0.33543, 0.241344, 0.208293, 0.035699, 0.02347, -0.195037, -0.280512, -0.092428, 0.227735, -0.295284, -0.104997, 0.094475, 
-0.05589, -0.244981, -0.026434, -0.100525, -0.226668, 0.158619, 0.047787, 0.076747, -0.419054, -0.468047, 0.059419, -0.235986, -0.002953, -0.353273, 0.077995, -0.083859, -0.330784, -0.060553, 0.126704, 0.048461, -0.051234, -0.569127, -0.227859, 0.032911, -0.587461, 0.030986, -0.251799, 0.589532, 0.028736, -0.223877, 0.035243, -0.077804, -0.162401, 0.10607, 0.035009, -0.060805, -0.144617, -0.228898, -0.061604, 0.414062, 0.349138, 0.080874, 0.348783, 0.050471, 0.204783, 0.153239, -0.186089, 0.22445, 0.322612, -0.200773, 0.427915, -0.02872, 0.333981, 0.192765, -0.225532, -0.570548, -0.489696, 0.053015, -0.335321, -0.139591, 0.096384, -0.202922, 0.419196, -0.12275, 0.107844, 0.104302, 0.235067, 0.151304, -0.341693, 0.069564, 0.566986, -0.229371, 0.037607, -0.113666, 0.088471, -0.32325, -0.391668, -0.063094, 0.071799, -0.180129, 0.048442, 0.174535, -0.375083, 0.061097, 0.58756, -0.384331, 0.13094, 0.097535, -0.204061, 0.298686, 0.712628, 0.069757, 0.37999, 0.024745, 0.227012, -0.225555, -0.195793, 0.14617, -0.333116, 0.04574, 0.150054, 0.185082, -0.098755, 0.888412, -0.224461, -0.17649, 0.133944, 0.25635, 0.025099, 0.225192, -0.133032, -0.103909, -0.463051, 0.071684, 0.303572, 0.444179, 0.262875, 0.086978, 0.389951, 0.248585, 0.084379, 0.020504, -0.264286, 0.105239, 0.23408, -0.20066, -0.31822, -0.143055, -0.239126, 0.019094, 0.005599, 0.000767, -0.302564, -0.081305, -0.14449, -0.064516, -0.418068, -0.416521, 0.217382, 0.118143, -0.195023, -0.515439, 0.022495, -0.063481, -0.213676, 0.087568, 0.120033, 0.198331, 0.297505, -0.393445, -0.181212, 0.01192, -0.692126, -0.044167, -0.204375, 0.573989, 0.115623, -0.424706, -0.034454, 0.011799, -0.103169, -0.072074, 0.208267, -0.051246, -0.167674, -0.416493, -0.012479, 0.062772, 0.09345, 0.078093, -0.005618, -0.154713, -0.099098, -0.119557, 0.061816, 0.168337, 0.399763, -0.135843, 0.070795, 0.103839, 0.427718, 0.091432, 0.051009, -0.357335, -0.387775, 0.12021, -0.138996, -0.046228, -0.194082, 0.018491, 0.175821, 0.008585, 
-0.150076, -0.12252, 0.251301, 0.333333, -0.300077, 0.112636, 0.48421, -0.174692, 0.07761, -0.081551, 0.043036, -0.636975, -0.2059, -0.115325, -0.179762, 0.01942, 0.064987, 0.360587, -0.048407, -0.21868, 0.483019, -0.352786, -0.017222, 0.062976, -0.266889, 0.137651, 0.764755, 0.236461, 0.411943, 0.158621, 0.276622, -0.052611, -0.311734, -0.134203, 0.00248, 0.027201, -0.03418, 0.167285, -0.291466, 0.676248, -0.351463, -0.18903, 0.259653, 0.236896, -0.088684, -0.046904, -0.104543, -0.111824, -0.338799, -0.125162, -0.326009, 0.391324, 0.529746, 0.305305, 0.058542, 0.013564, 0.064701, -0.197963, -0.457511, -0.214172, 0.135928, -0.241822, -0.140345, 0.043432, -0.109118, -0.392917, -0.132366, -0.221558, -0.151518, -0.061679, 0.097691, -0.079105, -0.333836, -0.174748, 0.016408, 0.129304, -0.08217, -0.524238, 0.134366, 0.12029, -0.339741, 0.283991, -0.055415, 0.44876, 0.056843, -0.327322, -0.220725, 0.216253, -0.38948, -0.126863, -0.141818, 0.654771, -0.075826, -0.292402, -0.057849, 0.132898, -0.200934, -0.110227, 0.045941, 0.232408, 0.096795, -0.252932, 0.018586, 0.234311, 0.302837, 0.356668, 0.130376, -0.198009, 0.159664, 0.127239, 0.12278, 0.106152, 0.248808, 0.051775, 0.394922, 0.132658, 0.351018, -0.073841, -0.015026, -0.070723, -0.074942, -0.074611, 0.06898, 0.009866, 0.026785, 0.049373, 0.268554, -0.231307, -0.092566, -0.133646, 0.217901, 0.286266, -0.300164, 0.010532, 0.608141, -0.123737, 0.066699, -0.178477, 0.213061, -0.538515, -0.030722, -0.355721, 0.185921, 0.179979, -0.022559, 0.179337, -0.121801, -0.155462, 0.336354, -0.19378, 0.095767, 0.076875, -0.430965, -0.065584, 0.805178, -0.041468, 0.349949, -0.061569, 0.203783, -0.175429, -0.139004, -0.02438, -0.111601, 0.158579, 0.090706, 0.077661, -0.43895, 0.79932, -0.347039, -0.032665, 0.139621, 0.161041, 0.177147, 0.059385, -0.046488, 0.051802, -0.306934, -0.173078, -0.006242, 0.160074, 0.358004, 0.120667, 0.252333, 0.08118, -0.002322, -0.391012, -0.151767, -0.272316, 0.200049, -0.183469, -0.095332, 0.059041, 
0.013683, -0.305977, -0.04653, -0.198045, 0.019833, -0.213719, 0.050885, -0.170957, -0.479132, -0.257974, 0.03545, 0.11396, -0.034638, -0.473828, 0.075231, 0.10333, -0.224935, -0.013011, -0.0131, 0.362523, 0.118234, -0.362592, -0.260484, 0.013859, -0.410644, 0.050267, -0.242984, 0.538721, 0.036703, -0.453155, -0.118391, 0.166942, -0.11538, -0.030916, 0.190522, 0.065237, -0.045957, -0.41924, -0.178835, 0.103413, 0.306313, 0.24808, 0.061325, -0.037556, 0.221549, 0.109214, -0.091852, 0.193044, 0.319918, -0.027607, 0.129355, 0.095185, 0.426608, 0.044668, 0.020034, -0.405508, -0.121849, -0.070735, -0.150142, -0.096599, 0.00092, 0.062061, 0.209492, -0.121461, -0.146288, -0.066235, -0.018115, 0.209493, -0.203704, 0.161706, 0.486054, -0.269519, -0.025933, -0.087143, 0.266147, -0.546468, -0.001097, -0.154017, 0.005453, -0.102836, -0.080557, 0.172409, -0.118934, -0.080649, 0.49152, -0.35198, 0.077017, 0.031805, -0.374316, 0.119281, 0.933388, 0.070497, 0.280964, 0.039116, 0.153065, -0.049956, -0.068467, 0.107621, -0.169788, 0.030741, 0.104993, 0.17403, -0.401186, 0.733077, -0.316487, -0.053363, 0.138592, 0.093575, 0.077033, 0.001569, -0.005733, -0.037521, -0.244904, -0.093476, -0.060641, 0.300968, 0.405003, 0.306823, 0.16678, -0.012958, -0.006126, -0.361579, -0.323753, -0.157971, 0.193813, -0.417827, -0.088955, 0.022664, -0.110577, -0.311725, -0.13473, -0.103663, 0.006619, -0.055007, 0.004179, -0.079507, -0.601044, -0.271854, -0.010507, 0.041326, -0.043931, -0.439731, 0.143462, -0.0474, -0.231815, -0.003607, 0.066499, 0.242668, -0.001755, -0.417887, -0.256474, 0.15595, -0.593089, -0.017857, -0.179655, 0.429413, 0.15928, -0.356336, -0.026626, 0.072283, -0.195752, -0.080045, 0.055401, -0.091238, 0.088194, -0.375184, -0.222887, 0.168539, 0.339445, 0.318157, 0.173101, 0.059031, -0.001507, 0.04475, 0.130344, 0.204488, 0.148535, 0.07434, 0.256516, 0.115146, 0.301183, 0.138692, 0.036861, -0.21802, -0.218086, -0.079343, -0.20027, -0.049736, 0.043859, -0.035974, 0.324735, -0.119733, 
-0.081503, -0.048835, 0.056606, 0.176576, -0.162798, 0.046404, 0.368096, -0.176483, -0.078733, -0.146908, 0.143032, -0.476485, -0.132879, -0.148826, -0.024092, 0.043819, -0.045477, 0.028067, -0.150501, -0.023518, 0.505237, -0.410387, -0.021659, 0.188298, -0.363063, 0.017835, 0.938486, 0.140007, 0.211164, 0.074486, 0.176025, -0.094846, -0.152674, 0.105167, -0.079289, 0.044144, 0.317982, 0.027808, -0.251359, 0.673147, -0.269377, -0.146555, 0.29992, 0.043805, 0.005728, 0.028602, -0.019481, 0.027522, -0.210928, -0.024866, 0.019535, 0.357749, 0.370839, 0.187062, 0.106261, 0.019501, 0.087976, -0.058226, -0.286223, -0.141639, 0.256102, -0.193239, -0.136529, -0.065361, -0.010293, -0.145204, -0.076596, -0.042081, -0.042864, -0.161441, 0.07923, -0.189005, -0.57092, -0.229667, -0.000142, 0.026703, -0.074708, -0.357938, 0.204191, 0.010405, -0.199777, -0.013087, -0.026703, 0.277435, 0.044683, -0.245623, -0.254253, 0.090549, -0.379526, 0.08371, -0.102319, 0.465182, 0.107831, -0.190813, -0.104071, 0.095602, -0.096419, 0.004923, 0.165428, -0.140529, 0.093943, -0.400396, -0.202875, 0.112039, 0.298601, 0.177449, 0.139791, 0.017155, 0.181592, 0.07643, -0.04052, 0.167168, 0.151009, 0.041025, 0.232377, -0.051797, 0.34195, 0.120338, 0.013998, -0.273628, -0.028128, 0.053894, -0.164448, -0.025765, 0.156199, 0.128677, 0.34406, -0.067341, 0.161378, -0.193003, -0.025168, 0.308159, -0.238582, 0.07311, 0.358174, -0.23544, -0.027808, -0.085819, 0.23131, -0.4558, -0.03373, -0.135512, 0.107362, 0.063933, 0.059579, 0.072035, -0.17375, -0.063714, 0.436221, -0.210269, 0.064986, 0.111895, -0.322997, 0.154059, 0.95152, 0.112023, 0.324677, 0.047423, 0.040458, -0.165101, 0.007929, 0.083491, -0.146684, 0.030213, 0.12504, -0.030139, -0.2603, 0.659827, -0.298312, -0.167839, 0.222359, 0.107713, -0.00799, -0.018656, 0.057197, -0.089846, -0.212884, -0.023933, -0.00484, 0.256506, 0.351813, 0.095775, 0.069619, 0.044895, -0.021839, -0.131961, -0.290838, -0.068762, 0.203238, -0.280011, -0.064418, 0.080196, 
0.067796, -0.265584, -0.033767, -0.132076, -0.049575, -0.120313, -0.014327, -0.17787, -0.57787, -0.208184, 0.031889, 0.130268, -0.057387, -0.460469, 0.048149, 0.077106, -0.267981, -0.032872, -0.015597, 0.336814, 0.031967, -0.378707, -0.244762, 0.104978, -0.416907, -0.050181, -0.164118, 0.486031, 0.076131, -0.248614, -0.075938, 0.112576, -0.126059, -0.068059, 0.038024, -0.004646, 0.015227, -0.346996, -0.23465, 0.159325, 0.300377, 0.320169, 0.10016, 0.0225, 0.139691, 0.128201, -0.036165, 0.120283, 0.227675, -0.060407, 0.22167, 0.133785, 0.330977, 0.089273, 0.028615, -0.268454, -0.160688, 0.058292, -0.167155, -0.040229, 0.114227, 0.07323, 0.19758, -0.131178, 0.02777, -0.136496, 0.026222, 0.273018, -0.15434, 0.053217, 0.368894, -0.210779, -0.051628, -0.070419, 0.194747, -0.593398, 0.030776, -0.196893, 0.03281, 0.082744, -0.031069, 0.124999, -0.144183, -0.041606, 0.415909, -0.313744, 0.120163, 0.193521, -0.293246, 0.109839, 0.907046, 0.16696, 0.318543, 0.105727, 0.122935, -0.005863, -0.090007, 0.121893, -0.091269, -0.009414, 0.143987, 0.040676, -0.277431, 0.763246, -0.357755, -0.164332, 0.160888, 0.15248, -0.013227, 0.007646, 0.035203, -0.072212, -0.21381, -0.075424, -0.023206, 0.261801, 0.342803, 0.197101, 0.172675, 0.018592, 0.009958, -0.218538, -0.247021, -0.168346, 0.094483, -0.30542, -0.068126, 0.09024, -0.045737, -0.274662, -0.184202, -0.090184, -0.024266, -0.213743, 0.016329, -0.154195, -0.46469, -0.2204, -0.120989, 0.131031, -0.086994, -0.387867, 0.087097, 0.039624, -0.186279, 0.023561, -0.079827, 0.459531, 0.099475, -0.276579, -0.267235, 0.254966, -0.347034, 0.109057, -0.116062, 0.405239, -0.048232, -0.284882, -0.134963, 0.064118, -0.15592, -0.079114, 0.025744, -0.012319, -0.023746, -0.31138, -0.113997, 0.087433, 0.192083, 0.221168, 0.089309, 0.008313, 0.263057, -0.006628, -0.019691, 0.165068, 0.239344, -0.02862, 0.287585, 0.030196, 0.359288, 0.134456, 0.017673, -0.125919, -0.144548, -0.037256, -0.066694, -0.073131, 0.091692, 0.126655, 0.228455, -0.129041, 
0.117153, -0.09926, 0.005741, 0.362735, -0.192664, 0.119039, 0.442563, -0.172219, -0.072006, -0.057931, 0.238955, -0.428795, -0.034042, -0.158179, 0.030131, 0.061259, 0.060345, 0.020224, -0.13133, 0.022956, 0.594065, -0.219414, 0.172009, 0.151159, -0.250063, -0.019743, 0.983195, 0.053768, 0.27606, 0.000205, 0.036585, -0.08061, 0.164538, 0.019115, -0.12303, 0.053191, 0.155121, 0.002319, -0.204232, 0.759599, -0.345157, -0.105275, 0.212711, 0.114738, 0.041783, 0.021911, 0.024559, 0.022007, -0.246078, -0.119811, 0.111283, 0.170539, 0.374724, 0.176707, 0.185831, -0.020369, 0.03889, -0.249496, -0.306367, -0.044997, 0.20198, -0.293622, 0.024091, 0.119004, 0.083868, -0.268997, -0.131261, -0.107147, 0.04122, -0.153925, 0.056371, -0.114602, -0.5001, -0.223287, 0.049646, 0.070746, -0.10271, -0.45542, 0.090542, 0.125979, -0.170098, 0.099001, -0.062028, 0.385795, 0.121447, -0.374284, -0.240626, 0.106801, -0.350646, -0.072205, -0.203791, 0.515182, -0.011467, -0.400866, -0.106071, 0.174289, -0.010475, -0.046733, 0.149336, -0.021266, -0.042444, -0.377191, -0.181662, 0.157997, 0.312891, 0.300316, 0.138281, -0.019001, 0.145199, 0.13119, -0.043852, 0.145311, 0.202983, 0.005287, 0.136804, 0.072978, 0.323414, 0.075757, 0.040262, -0.298108, -0.119107, 0.02395, -0.123465, -0.088793, 0.071817, 0.039424, 0.221476, -0.076754, -0.015764, -0.070013, 0.09291, 0.345106, -0.238167, 0.090008, 0.472908, -0.225073, -0.071439, -0.030431, 0.12772, -0.595529, -0.174741, -0.242914, 0.050166, -0.037671, 0.013798, 0.073714, -0.147761, -0.031313, 0.476316, -0.311281, 0.082307, 0.083056, -0.394803, 0.011314, 0.916756, -0.004498, 0.300103, -0.060031, 0.147993, -0.133072, -0.101231, -0.016348, -0.056322, 0.002627, 0.085116, -0.01344, -0.303036, 0.670767, -0.315785, -0.101351, 0.191034, 0.105196, -0.048296, -0.01477, 0.00797, 0.030148, -0.327611, -0.043632, -0.013158, 0.215424, 0.41901, 0.260657, 0.106819, 0.066672, -0.068933, -0.193087, -0.336542, -0.116848, 0.243707, -0.33234, -0.133037, 0.06172, 0.024353, 
-0.216921, -0.043352, -0.064554, -0.023994, -0.040064, 0.028304, -0.251212, -0.386658, -0.343781, 0.082682, 0.23729, -0.089153, -0.626854, 0.094521, 0.076692, -0.12092, 0.047902, -0.116469, 0.246952, 0.110556, -0.455724, -0.303615, 0.110658, -0.514376, 0.064554, -0.211924, 0.483672, 0.100621, -0.309734, -0.114827, 0.077751, -0.129922, -0.042616, 0.053103, -0.112506, 0.003826, -0.325649, -0.166323, 0.143524, 0.312149, 0.186821, 0.18188, 0.041438, 0.076019, -0.084268, -0.095804, 0.120441, 0.175169, 0.149207, 0.094922, 0.047515, 0.344228, 0.058854, 0.238358, -0.218735, -0.188687, -0.105772, -0.164438, -0.195768, -0.124653, 0.077305, 0.12243, -0.123084, -0.052117, -0.131667, 0.118151, 0.299456, -0.391353, 0.188879, 0.476485, -0.171132, 0.020755, 0.051484, 0.111852, -0.637295, -0.111147, -0.200273, 0.023475, -0.204771, 0.063837, 0.236879, -0.125984, -0.106218, 0.523723, -0.225598, 0.038537, 0.068795, -0.264649, 0.031462, 0.832735, 0.047389, 0.40832, -0.056168, 0.158615, -0.120373, -0.108151, -0.04995, -0.118133, 0.238003, 0.257999, 0.04378, -0.285118, 0.678109, -0.350416, -0.194558, 0.326434, 0.120577, -0.08122, -0.131165, -0.024874, 0.148097, -0.390402, 0.009778, -0.058749, 0.217562, 0.438919, 0.24025, 0.071752, 0.027243, 0.094594, -0.312101, -0.282669, -0.113119, 0.267959, -0.164117, -0.12234, 0.004493, 0.038514, -0.257082, -0.169818, -0.008464, -0.23051, -0.07236, 0.082548, -0.110399, -0.444359, -0.213252, 0.126485, 0.001227, -0.176755, -0.361307, 0.153481, 0.03414, -0.236202, 0.110233, 0.050094, 0.241277, -0.018482, -0.450127, -0.18488, 0.087063, -0.370764, 0.077537, -0.118433, 0.541852, 0.18796, -0.27122, -0.107676, 0.0834, -0.196934, 0.049143, 0.056467, -0.063454, -0.052044, -0.233132, -0.190047, 0.294275, 0.319707, 0.244612, 0.260221, -0.002249, 0.179446, 0.103188, -0.221878, 0.043676, 0.094381, -0.042069, 0.228205, 0.293484, 0.234312, 0.137863, -0.14831, -0.286861, -0.281138, -0.063649, -0.009018, -0.076392, 0.005921, 0.000721, 0.397161, -0.110292, 0.080247, 
-0.142796, 0.122654, 0.210733, -0.129947, 0.085469, 0.381019, -0.098502, 0.108402, -0.071677, 0.147081, -0.410109, -0.087982, -0.19818, 0.304224, -0.037941, 0.022669, 0.308071, -0.17921, 0.151184, 0.48409, -0.433399, 0.258573, 0.095362, -0.395088, 0.302725, 0.854045, 0.21546, 0.355853, 0.077043, 0.179687, -0.206635, -0.063665, 0.055657, -0.183273, 0.095305, 0.164618, -0.12851, -0.169525, 0.731504, -0.220558, -0.179772, 0.257558, 0.238363, 0.167141, 0.163434, 0.044037, -0.065114, -0.232972, 0.122372, 0.195098, 0.225209, 0.227797, 0.071307, 0.26521, 0.001829, 0.021507, -0.120317, -0.245018, -0.137621, 0.164546, -0.272257, -0.051142, 0.044221, -0.137652, -0.082872, -0.013795, -0.048463, -0.249587, -0.110013, 0.005526, -0.028993, -0.469545, -0.266298, -0.10217, 0.196606, -0.030157, -0.45766, -0.007273, 0.070887, -0.240771, 0.08931, -0.068292, 0.231221, 0.048106, -0.273168, -0.176831, 0.133575, -0.398225, -0.061179, -0.089963, 0.432054, 0.029768, -0.179535, 0.124459, -0.072897, -0.141341, -0.065683, 0.014119, 0.059902, -0.148193, -0.368307, -0.089698, 0.091676, 0.277057, 0.128954, 0.017161, -0.074992, 0.208872, 0.046901, 0.000731, 0.047376, 0.294682, 0.018153, 0.32435, 0.030036, 0.306236, 0.049723, 0.105182, -0.232509, -0.083798, 0.108593, -0.115696, -0.079353, 0.092648, 0.221704, 0.015207, -0.173292, 0.024198, 0.016146, 0.137175, 0.366974, -0.148107, 0.131438, 0.498071, -0.154942, 0.060861, -0.011639, 0.231808, -0.39321, -0.150774, -0.295095, 0.088469, 0.004311, 0.054828, -0.016339, -0.104137, 0.110733, 0.566211, -0.084473, 0.222065, 0.039821, -0.222098, 0.263346, 0.821456, 0.12276, 0.254222, -0.112178, 0.191128, -0.145414, -0.111082, -0.064383, -0.145492, 0.065188, -0.099521, -0.030977, -0.388943, 0.82268, -0.307851, -0.137831, 0.172868, 0.240505, 0.167419, 0.068848, -0.047883, -0.003391, -0.247966, -0.057044, -0.080622, 0.297816, 0.278665, 0.130474, 0.244491, 0.114558, -0.027355, -0.317212, -0.286682, -0.177232, 0.183336, -0.176007, -0.046417, 0.109212, -0.004384, 
-0.226952, -0.125861, -0.242716, -0.147281, -0.137082, 0.092893, -0.016394, -0.35733, -0.219176, 0.04285, 0.085036, -0.058827, -0.613757, 0.05502, 0.197419, -0.411023, 0.205623, 0.030087, 0.44594, 0.047651, -0.443077, -0.276242, 0.090845, -0.50751, -0.115728, -0.20376, 0.677248, -0.10554, -0.304255, -0.15641, 0.026556, -0.109163, -0.104007, -0.025587, 0.189726, 0.091208, -0.458364, 0.08051, 0.14318, 0.336941, 0.345143, 0.097649, -0.023114, 0.159949, 0.052818, 0.046029, 0.228599, 0.48503, -0.017399, 0.343434, 0.111623, 0.47164, -0.109192, -0.039071, -0.154431, -0.163284, -0.034185, 0.006858, -0.133018, -0.053796, 0.019603, 0.25173, -0.202725, -0.146908, -0.021796, 0.367957, 0.171764, -0.261212, 0.002342, 0.684845, -0.143313, 0.054746, -0.137854, 0.2931, -0.643513, 0.011333, -0.263861, 0.058524, 0.06142, -0.063527, 0.261804, -0.047599, -0.159404, 0.444402, -0.248528, 0.086962, 0.082758, -0.394644, 0.038797, 0.815074, -0.067432, 0.388852, -0.09435, 0.157642, -0.170117, -0.185423, 0.064881, -0.125417, 0.041747, 0.043631, 0.085716, -0.463062, 0.839557, -0.393107, -0.080155, 0.159365, 0.185302, 0.066975, 0.026376, -0.078793, 0.106427, -0.399322, -0.227818, -0.044919, 0.171047, 0.361356, 0.181016, 0.267792, 0.198424, 0.075031, -0.40664, -0.195306, -0.230746, 0.189761, -0.270807, -0.05331, -0.041097, -0.071671, -0.316509, -0.066584, -0.138895, -0.008101, -0.102445, 0.075868, -0.250279, -0.603278, -0.301121, 0.072435, -0.016748, -0.049818, -0.47968, 0.088782, 0.135567, -0.228312, 0.012924, 0.094851, 0.283529, 0.098966, -0.404477, -0.291196, 0.088863, -0.529537, 0.008407, -0.2042, 0.546704, 0.002714, -0.334713, -0.071804, 0.078494, -0.066674, -0.145905, 0.257275, -0.11636, -0.034781, -0.447853, -0.137517, 0.108466, 0.336518, 0.2405, 0.048995, 0.056321, 0.066934, 0.080852, -0.054232, 0.228743, 0.27028, 0.068408, 0.052135, 0.161182, 0.394561, 0.158711, 0.026257, -0.278622, -0.279802, -0.081935, -0.208806, -0.046708, 0.032036, -0.025102, 0.200799, -0.113571, 0.026563, 
-0.124683, 0.089251, 0.178709, -0.171645, 0.116404, 0.424459, -0.169488, -0.093839, -0.010849, 0.152446, -0.638508, -0.034337, -0.0609, 0.030602, 0.049867, 0.00188, 0.279276, -0.142762, -0.094304, 0.442255, -0.32599, -0.037331, 0.106083, -0.358022, 0.024095, 0.881813, 0.212541, 0.301319, 0.097805, 0.189077, -0.112985, -0.165874, 0.025308, -0.090574, 0.153389, 0.269133, 0.041747, -0.305469, 0.72558, -0.353433, -0.07445, 0.233435, 0.069201, -0.039471, -0.118724, -0.091304, -0.029209, -0.298865, -0.019777, -0.001459, 0.250206, 0.483399, 0.231964, 0.138254, 0.045146, 0.123047, -0.294242, -0.387482, -0.140996, 0.144057, -0.392296, -0.228247, -0.03266, -0.045165, -0.250543, -0.07247, -0.193221, -0.209333, -0.133759, 0.0051, -0.035216, -0.350699, -0.187143, -0.105681, 0.130929, -0.00689, -0.492283, 0.040777, 0.054741, -0.262184, 0.181068, -0.017467, 0.4333, 0.05964, -0.331614, -0.348459, 0.253363, -0.41985, -0.041632, -0.145149, 0.611162, -0.045538, -0.296061, 0.009148, 0.077596, -0.164184, -0.065562, -0.102659, 0.148133, 0.003616, -0.325789, -0.055818, 0.218059, 0.267365, 0.340298, 0.150394, -0.132348, 0.16961, 0.077635, 0.035509, 0.151996, 0.300489, -0.001083, 0.340728, 0.154483, 0.350343, -0.089141, -0.009022, -0.059579, -0.121873, 0.077433, -0.012825, -0.031586, 0.029805, 0.071347, 0.253718, -0.15961, -0.042293, -0.04304, 0.189946, 0.247071, -0.18043, 0.041071, 0.519427, -0.00708, 0.101962, -0.160208, 0.263984, -0.559804, 0.005516, -0.307515, 0.034784, 0.023774, 0.028508, 0.151439, -0.091735, -0.103641, 0.460922, -0.230899, 0.077495, 0.137245, -0.323575, 0.000818, 0.855301, 0.018962, 0.333466, -0.060434, 0.129165, -0.14897, -0.092553, 0.00521, -0.109629, 0.027975, 0.070393, 0.034127, -0.323049, 0.761561, -0.376418, -0.058751, 0.133082, 0.154404, 0.090739, 0.044615, -0.057691, 0.046276, -0.304785, -0.13943, -0.047966, 0.198765, 0.400832, 0.189962, 0.252182, 0.093654, 0.009217, -0.295916, -0.241776, -0.249348, 0.215478, -0.280781, -0.043099, 0.056866, -0.086446, 
-0.28542, -0.10263, -0.142845, 0.004101, -0.146207, 0.137773, 0.01991, -0.515569, -0.242367, 0.124171, 0.024829, 0.00524, -0.490995, 0.080227, 0.060379, -0.257281, 0.038834, -0.030168, 0.353876, 0.141927, -0.442391, -0.252003, 0.065879, -0.358907, -0.016175, -0.199499, 0.485906, 0.059961, -0.409193, -0.194853, 0.162179, -0.021054, -0.075677, 0.138345, -0.103567, -0.050736, -0.352136, -0.216499, 0.279856, 0.277554, 0.4036, 0.22424, -0.014046, 0.372131, 0.192706, -0.070443, 0.189751, 0.336082, -0.022999, 0.198999, 0.132697, 0.303855, 0.12071, 0.012348, -0.24362, -0.17699, 0.04682, -0.035746, -0.096208, -0.039478, 0.016349, 0.258203, 0.105167, 0.029588, 0.006325, 0.182629, 0.285359, -0.217695, 0.149036, 0.454588, -0.251955, -0.033201, -0.061591, 0.118013, -0.650982, -0.084083, -0.143202, -0.006318, -0.008569, -0.159204, 0.090598, -0.180861, -0.019799, 0.527974, -0.350236, 0.210312, 0.068968, -0.408665, 0.022297, 0.831393, 0.014007, 0.323176, 0.01664, 0.21434, -0.128606, -0.125161, -0.02576, -0.052593, 0.159241, 0.034305, -0.104404, -0.290484, 0.685, -0.224717, -0.087995, 0.09459, 0.188295, 0.054277, -0.116584, 0.039278, 0.136511, -0.349098, -0.141789, -0.095016, 0.189213, 0.538068, 0.226734, 0.122032, 0.070895, -0.003036, -0.194896, -0.273406, -0.135733, 0.33386, -0.336038, -0.127039, 0.030601, -0.006962, -0.213411, -0.077178, -0.109559, 0.081515, -0.251255, -0.162984, -0.094182, -0.472768, -0.137805, 0.082319, 0.092291, -0.090974, -0.411341, 0.056672, 0.051867, -0.242599, 0.027767, -0.013932, 0.377565, 0.058674, -0.271834, -0.257463, -0.035512, -0.303955, 0.035908, -0.310918, 0.40568, 0.071021, -0.384247, -0.024793, -0.019714, -0.099163, 0.138435, 0.202269, -0.045806, -0.064625, -0.465804, -0.229976, 0.008271, 0.422798, 0.14303, 0.14614, 0.063804, 0.265398, 0.088127, -0.029471, 0.208341, 0.390043, -0.098549, 0.085866, 0.097223, 0.533358, 0.116555, -0.020789, -0.388354, -0.042314, -0.037341, -0.26969, -0.265421, 0.050101, 0.182853, 0.142232, -0.058392, 0.125147, 
-0.196223, -0.062197, 0.154334, -0.285981, 0.107056, 0.542739, -0.302352, -0.119668, -0.050701, 0.242704, -0.616993, -0.024891, -0.22065, 0.091005, -0.077022, 0.012204, 0.073774, -0.047883, 0.052483, 0.586042, -0.42801, 0.128143, 0.013658, -0.573166, 0.165351, 0.818341, 0.102668, 0.232033, 0.007073, 0.102898, -0.10016, 0.080963, 0.213358, -0.183128, 0.151847, 0.187191, 0.163775, -0.228702, 0.858663, -0.429978, -0.271967, 0.236212, 0.047211, 0.082677, -0.027482, -0.08532, -0.068983, -0.169549, -0.234778, -0.081948, 0.284477, 0.508784, 0.264478, 0.21807, -0.070017, 0.020417, -0.40242, -0.503083, -0.096228, 0.29473, -0.401727, 0.022295, -0.092353, 0.032208, -0.197008, -0.091789, -0.077526, -0.000817, -0.202855, 0.005585, -0.194926, -0.594257, -0.235175, -0.008013, 0.163756, -0.203119, -0.415719, 0.118187, -0.008824, -0.219002, 0.10381, -0.030833, 0.36966, 0.086299, -0.30695, -0.321971, 0.020259, -0.578187, 0.086559, -0.26438, 0.476488, 0.144306, -0.431917, -0.044792, 0.045134, -0.15092, 0.037786, 0.12056, 0.049194, -0.084559, -0.484478, -0.243392, 0.009285, 0.376111, 0.254047, 0.260287, 0.031444, 0.19796, -0.009827, -0.008912, 0.236868, 0.233876, 0.000376, 0.081674, 0.031202, 0.450838, 0.264402, 0.036488, -0.314071, -0.153201, -0.056497, -0.118256, -0.014591, 0.081924, 0.188796, 0.236373, 0.035029, 0.023301, -0.152306, 0.027154, 0.181832, -0.296469, 0.059073, 0.488245, -0.270095, -0.013301, 0.101577, 0.237062, -0.586709, -0.178387, -0.241824, 0.074199, -0.075564, 0.049638, 0.095498, -0.031951, 0.032343, 0.50709, -0.352324, 0.089896, 0.162482, -0.499489, 0.07216, 0.793042, 0.182405, 0.214595, -0.042809, 0.129733, -0.0664, -0.026775, 0.128285, -0.132321, -0.049176, 0.073795, 0.028186, -0.205324, 0.749286, -0.367043, -0.177599, 0.167009, 0.171053, 0.072953, -0.025441, -0.093129, -0.091065, -0.291223, -0.105123, 0.04016, 0.206933, 0.406968, 0.188832, 0.255243, -0.037469, -0.122054, -0.182521, -0.315968, -0.069228, 0.27366, -0.302801, 0.025367, 0.134408, 0.028567, 
-0.173226] } }\n",
            "  }\n",
            "  feature {\n",
            "    key: \"nodes/paper.labels\"\n",
            "    value { int64_list { value: [193, 265, 56, 193, 277, 277, 283, 83, 236, 193, 265, 193, 236, 144, 236, 97, 265, 265, 258, 258, 262, 258, 236, 236, 198, 145, 236, 265, 193, 277] } }\n",
            "  }\n",
            "  feature {\n",
            "    key: \"nodes/paper.year\"\n",
            "    value { int64_list { value: [2011, 2014, 2014, 2014, 2010, 2012, 2010, 2014, 2011, 2012, 2013, 2015, 2010, 2013, 2011, 2011, 2011, 2014, 2012, 2011, 2010, 2015, 2012, 2013, 2013, 2017, 2015, 2017, 2013, 2012] } }\n",
            "  }\n",
            "}\n"
          ]
        }
      ],
      "source": [
        "def _example_to_text(example):\n",
        "  \"\"\"MessageToString() with fewer linebreaks.\"\"\"\n",
        "  lines = [\"features {\"]\n",
        "  for k, v in sorted(example.features.feature.items()):\n",
        "    v_str = text_format.MessageToString(v, as_one_line=True,\n",
        "                                        use_short_repeated_primitives=True)\n",
        "    lines.append(\"  feature {\")\n",
        "    lines.append(f\"    key: \\\"{k}\\\"\")\n",
        "    lines.append(f\"    value {{ {v_str} }}\")\n",
        "    lines.append(\"  }\")\n",
        "  lines.append(\"}\")\n",
        "  return \"\\n\".join(lines)\n",
        "\n",
        "for serialized_example in demo_ds.take(1):\n",
        "  example = tf.train.Example.FromString(serialized_example.numpy())\n",
        "  print(_example_to_text(example))"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "JdA_-_qFs1yw"
      },
      "source": [
        "The following helper lets us filter input data by OGB's specific rule for the test/validation/train split before parsing the full GraphTensor. (Users working on their own datasets may want to use separate datasets to begin with.)"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "yMLZWmHiotWL"
      },
      "outputs": [],
      "source": [
        "def _is_in_split(split_name):\n",
        "  def filter_fn(serialized_example):\n",
        "    features = {\"years\": tf.io.RaggedFeature(value_key=\"nodes/paper.year\",\n",
        "                                             dtype=tf.int64)}\n",
        "    years = tf.io.parse_single_example(serialized_example, features)[\"years\"]\n",
        "    year = years[0]  # By convention, the root node is the first node.\n",
        "    if split_name == \"train\":  # 629,571\n",
        "      return year \u003c= 2017\n",
        "    elif split_name == \"validation\":  # 64,879\n",
        "      return year == 2018\n",
        "    elif split_name == \"test\":  # 41,939\n",
        "      return year == 2019\n",
        "    else:\n",
        "      raise ValueError(f\"Unknown split_name: '{split_name}'\")\n",
        "  return filter_fn\n",
        "\n",
        "num_training_samples = 629571\n",
        "num_validation_samples = 64879\n",
        "num_test_samples = 41939"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "9RGmoV0XyBTs"
      },
      "source": [
        "### The GraphTensor type"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "3WOLCADTxQZI"
      },
      "source": [
        "The cornerstone of model building with TF-GNN is the `tfgnn.GraphTensor` type. The following code cells demonstrate the essentials of using it. For more information, please see the comprehensive [Introduction to GraphTensor](https://github.com/tensorflow/gnn/blob/main/tensorflow_gnn/docs/guide/graph_tensor.md).\n",
        "\n",
        "`tfgnn.GraphTensor` is a TensorFlow Extension Type (or \"composite tensor\") that consists of multiple Tensors but can be used as one object in a `tf.data.Dataset`, and in the inputs/outputs of a Keras layer or a `tf.function`. Other examples of this are `tf.RaggedTensor` and `tf.SparseTensor`.\n",
        "\n",
        "Every such type has a matching type spec (just as a `tf.Tensor` has a `tf.TensorSpec`). A `GraphTensorSpec` can be created from the `GraphSchema` and contains information about its node sets, their connection by edge sets, and the features on the graph. With that, we can parse `tf.Example`s as above into `GraphTensor` values:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "HeWsJEf6NBh4"
      },
      "outputs": [],
      "source": [
        "example_input_spec = tfgnn.create_graph_spec_from_schema_pb(graph_schema)\n",
        "parse = functools.partial(tfgnn.parse_single_example, example_input_spec)\n",
        "graph = parse(serialized_example)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "7kOzXgYWzuYe"
      },
      "source": [
        "The `GraphTensor` is an immutable container of tensors, indexed by names. For example, here are the features and labels of all papers in this subgraph:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "executionInfo": {
          "elapsed": 4,
          "status": "ok",
          "timestamp": 1711613649035,
          "user": {
            "displayName": "Arno Eigenwillig",
            "userId": "11315922694496346185"
          },
          "user_tz": -60
        },
        "id": "7n_4Ui8t0HeR",
        "outputId": "4d5490a2-7fe1-4a32-97b6-33aa8da89a10"
      },
      "outputs": [
        {
          "data": {
            "text/plain": [
              "\u003ctf.Tensor: shape=(30, 1), dtype=int64, numpy=\n",
              "array([[193],\n",
              "       [265],\n",
              "       [ 56],\n",
              "       [193],\n",
              "       [277],\n",
              "       [277],\n",
              "       [283],\n",
              "       [ 83],\n",
              "       [236],\n",
              "       [193],\n",
              "       [265],\n",
              "       [193],\n",
              "       [236],\n",
              "       [144],\n",
              "       [236],\n",
              "       [ 97],\n",
              "       [265],\n",
              "       [265],\n",
              "       [258],\n",
              "       [258],\n",
              "       [262],\n",
              "       [258],\n",
              "       [236],\n",
              "       [236],\n",
              "       [198],\n",
              "       [145],\n",
              "       [236],\n",
              "       [265],\n",
              "       [193],\n",
              "       [277]])\u003e"
            ]
          },
          "execution_count": 11,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "graph.node_sets[\"paper\"][\"labels\"]"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "bzLOCN_z4rbR"
      },
      "source": [
        "By convention, the root node of the sampled subgraph is stored as the first node (index 0) of its node set, so the target label for this subgraph is the first item in the tensor above."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "sLYa39Cp5xtY"
      },
      "source": [
        "A tf.data.Dataset of GraphTensors can freely be batched (and unbatched), which simply stacks (or unstacks) all the tensors that make up the individual graphs. Let's take edge set \"cites\" as an example."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "executionInfo": {
          "elapsed": 4325,
          "status": "ok",
          "timestamp": 1711613653357,
          "user": {
            "displayName": "Arno Eigenwillig",
            "userId": "11315922694496346185"
          },
          "user_tz": -60
        },
        "id": "ztkM4Ckxu0Hr",
        "outputId": "bbd70a96-c3a2-477c-8a0c-9dbbd339dcce"
      },
      "outputs": [
        {
          "name": "stdout",
          "output_type": "stream",
          "text": [
            "\u003ctf.RaggedTensor [[0], [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],\n",
            " [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]]\u003e\n",
            "\u003ctf.RaggedTensor [[20],\n",
            " [11, 90, 144, 167, 181, 196, 199, 201, 228, 264, 267, 287, 376, 401],\n",
            " [32, 88, 111, 175, 211, 235, 301, 309, 311, 350, 377, 521, 631, 688, 693,\n",
            "  723, 760, 770, 788, 795, 940, 1100, 1127]                               ]\u003e\n"
          ]
        }
      ],
      "source": [
        "for batched_graph in demo_ds.map(parse).batch(3).take(1):\n",
        "  print(batched_graph.edge_sets[\"cites\"].adjacency.source)\n",
        "  print(batched_graph.edge_sets[\"cites\"].adjacency.target)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "nfQI0bwJ9BaX"
      },
      "source": [
        "The resulting RaggedTensors and per-graph indices are inconvenient for the modeling code. Hence GraphTensor allows merging a batch of graphs into a single graph with contiguous indexing. The distinction between the inputs is preserved by the notion of **components** inside the merged graph: all node sets and edge sets keep track of the size of each component. (This is always true; there just happens to be a single component in each graph before merging.)"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "executionInfo": {
          "elapsed": 370,
          "status": "ok",
          "timestamp": 1711613653710,
          "user": {
            "displayName": "Arno Eigenwillig",
            "userId": "11315922694496346185"
          },
          "user_tz": -60
        },
        "id": "yPgKVJ4Mu0Ez",
        "outputId": "e7eee331-6cb0-449f-8150-9d6c0bc04921"
      },
      "outputs": [
        {
          "name": "stdout",
          "output_type": "stream",
          "text": [
            "tf.Tensor([ 1 14 23], shape=(3,), dtype=int32)\n",
            "tf.Tensor(\n",
            "[  0  30  30  30  30  30  30  30  30  30  30  30  30  30  30 503 503 503\n",
            " 503 503 503 503 503 503 503 503 503 503 503 503 503 503 503 503 503 503\n",
            " 503 503], shape=(38,), dtype=int32)\n",
            "tf.Tensor(\n",
            "[  20   41  120  174  197  211  226  229  231  258  294  297  317  406\n",
            "  431  535  591  614  678  714  738  804  812  814  853  880 1024 1134\n",
            " 1191 1196 1226 1263 1273 1291 1298 1443 1603 1630], shape=(38,), dtype=int32)\n",
            "tf.Tensor([  30  473 1308], shape=(3,), dtype=int32)\n"
          ]
        }
      ],
      "source": [
        "merged_graph = batched_graph.merge_batch_to_components()\n",
        "print(merged_graph.edge_sets[\"cites\"].sizes)\n",
        "print(merged_graph.edge_sets[\"cites\"].adjacency.source)\n",
        "print(merged_graph.edge_sets[\"cites\"].adjacency.target)\n",
        "print(merged_graph.node_sets[\"paper\"].sizes)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "ALxg9MxsNjQY"
      },
      "source": [
        "Features on the node sets or edge sets of the merged graph have the set's total size as their first dimension, followed by the dimensions of individual feature values. This makes them compatible with many standard layers like `tf.keras.layers.Dense`, which accept an unknown batch size as the first dimension, except that the \"batch size\" of a node feature is the total number of nodes in the batch.\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "executionInfo": {
          "elapsed": 20,
          "status": "ok",
          "timestamp": 1711613653710,
          "user": {
            "displayName": "Arno Eigenwillig",
            "userId": "11315922694496346185"
          },
          "user_tz": -60
        },
        "id": "Q5mYSUYIODpR",
        "outputId": "a7ec92a7-40e1-4dc2-e43c-eadd4940072b"
      },
      "outputs": [
        {
          "name": "stdout",
          "output_type": "stream",
          "text": [
            "1811\n",
            "(1811, 128)\n"
          ]
        }
      ],
      "source": [
        "print(merged_graph.node_sets[\"paper\"].total_size.numpy())\n",
        "print(merged_graph.node_sets[\"paper\"][\"feat\"].shape)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "bEIQsona-hLG"
      },
      "source": [
        "**Please remember:** Each TF-GNN model needs to call `.merge_batch_to_components()` at some point after the final input batches for each model replica have been formed but before the actual GNN model starts. For TPUs, this and the subsequent padding to fixed sizes have to happen before data is fed into the trained Model."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "o4tTFEnui6he"
      },
      "source": [
        "## Model building and training"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "n7dDGuQOFYqK"
      },
      "source": [
        "We use TensorFlow's [Distribution Strategy](https://www.tensorflow.org/guide/distributed_training) API to write a model that can train in parallel on multiple [Cloud TPUs](https://cloud.google.com/tpu), multiple GPUs, or maybe just locally on CPU. A distribution strategy is not required for a single GPU or CPU, but we might as well show how it's done for the general case.\n",
        "\n",
        "For Cloud TPU, the following code assumes the Colab runtime type \"TPU v2\", that is, a TPU VM. Do not use the runtime type \"TPU (deprecated)\", which uses a TPU Node on a separate VM."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "executionInfo": {
          "elapsed": 16,
          "status": "ok",
          "timestamp": 1711613653711,
          "user": {
            "displayName": "Arno Eigenwillig",
            "userId": "11315922694496346185"
          },
          "user_tz": -60
        },
        "id": "5Te_iGkVwYPB",
        "outputId": "615c099a-f0de-4993-a49a-f7235697c676"
      },
      "outputs": [
        {
          "name": "stdout",
          "output_type": "stream",
          "text": [
            "Using MirroredStrategy for GPUs\n",
            "GPU 0: Tesla T4 (UUID: GPU-17a45941-9927-e24a-29e6-ba2ba8d0505c)\n",
            "Found 1 replicas in sync\n"
          ]
        }
      ],
      "source": [
        "if tf.config.list_physical_devices(\"TPU\"):\n",
        "  print(\"Using TPUStrategy\")\n",
        "  tpu_resolver = tf.distribute.cluster_resolver.TPUClusterResolver(\"local\")\n",
        "  tf.config.experimental_connect_to_cluster(tpu_resolver)\n",
        "  tf.tpu.experimental.initialize_tpu_system(tpu_resolver)\n",
        "  strategy = tf.distribute.TPUStrategy(tpu_resolver)\n",
        "  assert isinstance(strategy, tf.distribute.TPUStrategy)\n",
        "elif tf.config.list_physical_devices(\"GPU\"):\n",
        "  print(\"Using MirroredStrategy for GPUs\")\n",
        "  gpu_list = !nvidia-smi -L\n",
        "  print(\"\\n\".join(gpu_list))\n",
        "  strategy = tf.distribute.MirroredStrategy()\n",
        "else:\n",
        "  print(\"Using default strategy\")\n",
        "  strategy = tf.distribute.get_strategy()\n",
        "print(f\"Found {strategy.num_replicas_in_sync} replicas in sync\")"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "FH19bnQ8yRro"
      },
      "source": [
        "### Padding (for TPUs)\n"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "Src-YciELn-a"
      },
      "source": [
        "Training on Cloud TPUs involves just-in-time compilation of a TensorFlow model to TPU code, and requires fixed shapes for all Tensors involved. To achieve that for graph data with variable numbers of nodes and edges, we need to pad each input Tensor to some fixed maximum size. For training on GPUs or CPU, this extra step is not necessary."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "executionInfo": {
          "elapsed": 260,
          "status": "ok",
          "timestamp": 1711613653966,
          "user": {
            "displayName": "Arno Eigenwillig",
            "userId": "11315922694496346185"
          },
          "user_tz": -60
        },
        "id": "06AMoGjTJmUg",
        "outputId": "12198e0f-cab0-463b-c375-f7d6aedf0a18"
      },
      "outputs": [
        {
          "name": "stdout",
          "output_type": "stream",
          "text": [
            "Padding is OFF\n"
          ]
        }
      ],
      "source": [
        "#@title Pad batched GraphTensors to fixed sizes?\n",
        "#@markdown By default (`None`), padding is used for TPUs only.\n",
        "use_padding = None #@param [\"None\", \"True\", \"False\"] {type:\"raw\"}\n",
        "if use_padding is None:\n",
        "  use_padding = isinstance(strategy, tf.distribute.TPUStrategy)\n",
        "print(\"Padding is\", [\"OFF\", \"ON\"][use_padding])\n",
        "if isinstance(strategy, tf.distribute.TPUStrategy) and not use_padding:\n",
        "  raise ValueError(\"Padding is required for running on TPU\")"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "A2jr5ywaLcku"
      },
      "source": [
        "For the validation dataset, we need to make sure that every batch of examples fits within the fixed sizes, no matter how the parallelism in the input pipeline ends up combining examples into batches. Therefore, we use a rather generous estimate, basically scaling each Tensor's observed maximum size by a factor of batch_size. If that were to run into limitations of accelerator memory, we'd rather shrink the batch size than lose examples.\n",
        "\n",
        "The dataset in this example is not too big, so we can scan it within a few minutes to determine constraints large enough for all inputs. (For huge datasets under your control, it may be worth inferring an upper bound from the sampling spec instead.)"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "executionInfo": {
          "elapsed": 10,
          "status": "ok",
          "timestamp": 1711613653967,
          "user": {
            "displayName": "Arno Eigenwillig",
            "userId": "11315922694496346185"
          },
          "user_tz": -60
        },
        "id": "L8Mny61cJmJx",
        "outputId": "2e2edd3a-4634-4fa4-f301-a8d99ad09cfd"
      },
      "outputs": [
        {
          "name": "stdout",
          "output_type": "stream",
          "text": [
            "Validation uses a global batch size of 128 (128 per replica).\n"
          ]
        }
      ],
      "source": [
        "global_batch_size = 128\n",
        "validation_global_batch_size = global_batch_size\n",
        "assert validation_global_batch_size % strategy.num_replicas_in_sync == 0, \"divisibility required\"\n",
        "validation_replica_batch_size = validation_global_batch_size // strategy.num_replicas_in_sync\n",
        "print(f\"Validation uses a global batch size of {validation_global_batch_size} \"\n",
        "      f\"({validation_replica_batch_size} per replica).\")\n",
        "validation_size_constraints = None\n",
        "if use_padding:\n",
        "  # The \"paper\" node set needs at least one node in each graph component,\n",
        "  # incl. those added for padding, because the model will read out the state\n",
        "  # of the sampled subgraph's root node from each component.\n",
        "  min_nodes_per_component = {\"paper\": 1}\n",
        "  validation_size_constraints = tfgnn.find_tight_size_constraints(\n",
        "      _get_dataset(input_file_pattern, shuffle=False,\n",
        "                   filter_fn=_is_in_split(\"validation\"),  # For OGB only.\n",
        "      ).map(parse),\n",
        "      target_batch_size=validation_replica_batch_size,\n",
        "      min_nodes_per_component=min_nodes_per_component)\n",
        "  print(f\"Validation data is padded to: {validation_size_constraints}\")"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "NEDV6pFkLY4k"
      },
      "source": [
        "For the training dataset, TF-GNN allows you to optimize more aggressively for large batch sizes: size constraints satisfied by 100% of all possible inputs would have to accommodate the rare combination of many large examples in one batch.\n",
        "\n",
        "Instead, we use size constraints that will fit *close to* 100% of the randomly drawn training batches. This is not covered by the theory supporting stochastic gradient descent (which calls for examples drawn independently at random), but in practice, it often works, and allows larger batch sizes within the limits of accelerator memory, and hence faster convergence of the training. Running the code block below with `use_padding=True` shows the sizes required for some success ratios. (Compare them to the constraints above, which are satisfied 100% at a fourth of the batch size.)\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "executionInfo": {
          "elapsed": 8,
          "status": "ok",
          "timestamp": 1711613653967,
          "user": {
            "displayName": "Arno Eigenwillig",
            "userId": "11315922694496346185"
          },
          "user_tz": -60
        },
        "id": "ulDdkMmFJmMQ",
        "outputId": "5d2f5963-d9ab-4c27-a529-c87893c3af0c"
      },
      "outputs": [
        {
          "name": "stdout",
          "output_type": "stream",
          "text": [
            "Training uses a batch size of 128 (128 per replica).\n"
          ]
        }
      ],
      "source": [
        "training_global_batch_size = global_batch_size\n",
        "assert training_global_batch_size % strategy.num_replicas_in_sync == 0, \"divisibility required\"\n",
        "training_replica_batch_size = training_global_batch_size // strategy.num_replicas_in_sync\n",
        "print(f\"Training uses a batch size of {training_global_batch_size} \"\n",
        "      f\"({training_replica_batch_size} per replica).\")\n",
        "training_size_constraints = None\n",
        "if use_padding:\n",
        "  success_ratios = [0.90, 0.98, 0.99]\n",
        "  constraints = tfgnn.learn_fit_or_skip_size_constraints(\n",
        "      _get_dataset(input_file_pattern,\n",
        "                   filter_fn=_is_in_split(\"train\"),  # For OGB only.\n",
        "      ).map(parse),\n",
        "      training_replica_batch_size,\n",
        "      min_nodes_per_component=min_nodes_per_component,\n",
        "      success_ratio=success_ratios, sample_size=20000)\n",
        "  for sr_idx, sr in enumerate(success_ratios):\n",
        "    print(f\"Success ratio {sr} requires: {constraints[sr_idx]}\")\n",
        "  sr_idx = 2\n",
        "  training_size_constraints = constraints[sr_idx]\n",
        "  print(f\"\\nSelected success ratio: {success_ratios[sr_idx]}.\")\n",
        "  print(f\"Training data is padded to: {training_size_constraints}\")"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "-w491kLYjGK2"
      },
      "source": [
        "### Preprocessing the input features"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "4HF-9ZkYGtHr"
      },
      "source": [
        "As usual in TensorFlow, the non-trainable transformations of the input features are split off into a `Dataset.map()` call while the model proper consists of the trainable and accelerator-compatible parts. However, even this non-trainable part is put into a Keras model, which is a convenient way to track resources for export (such as lookup tables). The following code cells provide a fully worked example; for more background information, please see our [Input pipeline](https://github.com/tensorflow/gnn/blob/main/tensorflow_gnn/docs/guide/input_pipeline.md) guide."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "P00doywlnP6M"
      },
      "outputs": [],
      "source": [
        "# Callbacks used by tfgnn.keras.layers.MapFeatures below.\n",
        "\n",
        "def _preprocess_node_features(node_set, *, node_set_name):\n",
        "  if node_set_name == \"paper\":\n",
        "    return {\n",
        "        # Retain the word2vec embedding unchanged.\n",
        "        \"feat\": node_set[\"feat\"],\n",
        "        # Keep the label, until popped later on.\n",
        "        \"labels\": node_set[\"labels\"]}\n",
        "  elif node_set_name == \"author\":\n",
        "    # There are no useful features. Instead, we create a tensor of hidden node\n",
        "    # states that are empty, i.e., with shape [batch_size, (num_nodes), 0].\n",
        "    return {\"empty_state\": tfgnn.keras.layers.MakeEmptyFeature()(node_set)}\n",
        "  elif node_set_name == \"field_of_study\":\n",
        "    # Convert the string id to an index into an embedding table.\n",
        "    # Conveniently, this Keras layer can handle RaggedTensors.\n",
        "    return {\"hashed_id\": tf.keras.layers.Hashing(num_bins=50_000)(\n",
        "        node_set[\"#id\"])}\n",
        "  elif node_set_name == \"institution\":\n",
        "    # Convert the string id to an index into an embedding table.\n",
        "    return {\"hashed_id\": tf.keras.layers.Hashing(num_bins=6_500)(\n",
        "        node_set[\"#id\"])}\n",
        "  else:\n",
        "    raise KeyError(f\"Unexpected node_set_name='{node_set_name}'\")\n",
        "\n",
        "def _drop_all_features(graph_piece, **unused_kwargs):\n",
        "  return {}"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "oIgcbUmgnP2o"
      },
      "outputs": [],
      "source": [
        "def _make_preprocessing_model(graph_tensor_spec, size_constraints):\n",
        "  \"\"\"Returns Keras model to preprocess a batched and parsed GraphTensor.\"\"\"\n",
        "  graph = input_graph = tf.keras.layers.Input(type_spec=graph_tensor_spec)\n",
        "\n",
        "  # Convert input features to suitable representations for use on GPU/TPU.\n",
        "  # Drop unused features (like id strings for tracking the source of examples).\n",
        "  graph = tfgnn.keras.layers.MapFeatures(\n",
        "      node_sets_fn=_preprocess_node_features,\n",
        "      edge_sets_fn=_drop_all_features,\n",
        "      context_fn=_drop_all_features)(graph)\n",
        "  assert \"labels\" in graph.node_sets[\"paper\"].features\n",
        "\n",
        "  ### IMPORTANT: All TF-GNN modeling code assumes a GraphTensor of shape []\n",
        "  ### in which the graphs of the input batch have been merged to components of\n",
        "  ### one contiguously indexed graph. There are no edges between components,\n",
        "  ### so no information flows between them.\n",
        "  graph = graph.merge_batch_to_components()\n",
        "\n",
        "  # Optionally, pad to size_constraints (required for TPU).\n",
        "  if size_constraints:\n",
        "    graph, mask = tfgnn.keras.layers.PadToTotalSizes(size_constraints)(graph)\n",
        "  else:\n",
        "    mask = None\n",
        "\n",
        "  # Split the label off the padded input graph.\n",
        "  root_label = tfgnn.keras.layers.ReadoutFirstNode(\n",
        "      node_set_name=\"paper\", feature_name=\"labels\")(graph)\n",
        "  graph = graph.remove_features(node_sets={\"paper\": [\"labels\"]})\n",
        "  assert \"labels\" not in graph.node_sets[\"paper\"].features\n",
        "\n",
        "  outputs = (graph, root_label) if mask is None else (graph, root_label, mask)\n",
        "  return tf.keras.Model(input_graph, outputs)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "EOc6UbeoEVKM"
      },
      "source": [
        "The preprocessing model merges its batch of input graphs into one contiguously indexed graph with multiple components (as explained above), and then pads it (if applicable). After that, the dataset can no longer be rebatched to split it up between replicas, so we use `tf.distribute.Strategy.distribute_datasets_from_function()` to build a `DistributedDataset` of per-replica inputs. (If it weren't for TPUs or distribution strategies, the call to `merge_batch_to_components()` could be deferred to the start of the trained model, and we could leave it to `Model.fit()` to split up a single Dataset.)"
      ]
    },
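    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "To make the merging step more concrete, here is a minimal plain-Python sketch of what it means to turn a batch of graphs into components of one contiguously indexed graph. It is a conceptual illustration under simplifying assumptions (a single node set, edges as index pairs), not TF-GNN's implementation; `merge_to_components` and the dict layout are hypothetical.\n",
        "\n",
        "```python\n",
        "def merge_to_components(graphs):\n",
        "  # Each input graph is a dict with a node count and a list of\n",
        "  # (source, target) edges that index into that graph's own nodes.\n",
        "  merged_edges = []\n",
        "  component_sizes = []\n",
        "  first_nodes = []\n",
        "  offset = 0\n",
        "  for g in graphs:\n",
        "    # Shift node indices so that they stay unique after merging.\n",
        "    merged_edges.extend((s + offset, t + offset) for s, t in g['edges'])\n",
        "    component_sizes.append(g['num_nodes'])\n",
        "    first_nodes.append(offset)\n",
        "    offset += g['num_nodes']\n",
        "  return {'num_nodes': offset, 'edges': merged_edges,\n",
        "          'component_sizes': component_sizes, 'first_nodes': first_nodes}\n",
        "\n",
        "batch = [{'num_nodes': 3, 'edges': [(0, 1), (0, 2)]},\n",
        "         {'num_nodes': 2, 'edges': [(0, 1)]}]\n",
        "merged = merge_to_components(batch)\n",
        "# merged['edges'] is [(0, 1), (0, 2), (3, 4)]: no edge crosses components.\n",
        "```\n",
        "\n",
        "The `first_nodes` entries mark where each component starts, which is the position a first-node readout would look at for each original input graph."
      ]
    },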
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "executionInfo": {
          "elapsed": 9470,
          "status": "ok",
          "timestamp": 1711613663432,
          "user": {
            "displayName": "Arno Eigenwillig",
            "userId": "11315922694496346185"
          },
          "user_tz": -60
        },
        "id": "Me3cV-Ws1MHJ",
        "outputId": "54877871-5950-4862-ad16-d30ac376c941"
      },
      "outputs": [
        {
          "name": "stderr",
          "output_type": "stream",
          "text": [
            "WARNING:tensorflow:Mapping types may not work well with tf.nest. Prefer using MutableMapping for \u003cclass 'tensorflow_gnn.graph.graph_tensor._ImmutableMapping'\u003e\n",
            "WARNING:tensorflow:Mapping types may not work well with tf.nest. Prefer using MutableMapping for \u003cclass 'tensorflow_gnn.graph.graph_tensor._ImmutableMapping'\u003e\n"
          ]
        }
      ],
      "source": [
        "example_input_spec = tfgnn.create_graph_spec_from_schema_pb(graph_schema)\n",
        "\n",
        "def _get_preprocessed_dataset(\n",
        "    input_context, split_name, per_replica_batch_size, size_constraints):\n",
        "  training = split_name == \"train\"\n",
        "  ds = _get_dataset(input_file_pattern, shuffle=training,\n",
        "                    filter_fn=_is_in_split(split_name),  # For OGB only.\n",
        "                    input_context=input_context)\n",
        "  if training:\n",
        "    ds = ds.repeat()\n",
        "  # There is no need to drop_remainder when batching, even for TPU:\n",
        "  # padding the GraphTensor can handle variable numbers of components.\n",
        "  ds = ds.batch(per_replica_batch_size)\n",
        "  ds = ds.map(tfgnn.keras.layers.ParseExample(example_input_spec))\n",
        "  if training and size_constraints:\n",
        "    ds = ds.filter(functools.partial(tfgnn.satisfies_size_constraints,\n",
        "                                     total_sizes=size_constraints))\n",
        "  ds = ds.map(_make_preprocessing_model(ds.element_spec, size_constraints),\n",
        "              deterministic=False, num_parallel_calls=tf.data.AUTOTUNE)\n",
        "  return ds\n",
        "\n",
        "def _get_distributed_preprocessed_dataset(\n",
        "    strategy, split_name, per_replica_batch_size, size_constraints):\n",
        "  \"\"\"Returns DistributedDataset and its per-replica element_spec.\"\"\"\n",
        "  return strategy.distribute_datasets_from_function(functools.partial(\n",
        "      _get_preprocessed_dataset,\n",
        "      split_name=split_name, per_replica_batch_size=per_replica_batch_size,\n",
        "      size_constraints=size_constraints))\n",
        "\n",
        "train_ds = _get_distributed_preprocessed_dataset(\n",
        "    strategy, \"train\",\n",
        "    training_replica_batch_size, training_size_constraints)\n",
        "valid_ds = _get_distributed_preprocessed_dataset(\n",
        "    strategy, \"validation\",\n",
        "    validation_replica_batch_size, validation_size_constraints)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "ByIFuiI40SfJ"
      },
      "source": [
        "### The GNN core of the model\n",
        "\n"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "2_NdZCxtH2Ou"
      },
      "source": [
        "We are going to use the same Model object for training, validation, and export for inference, so we need to build it from an input type spec with generic tensor shapes. (For TPUs, using it on a *dataset* with fixed-size elements will suffice.) Rather than spelling out the type spec by hand, we create a non-distributed, non-padded dummy dataset whose element spec reflects the preprocessing. (That's cheap, as long as we don't create an iterator on it.)"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "T1Km8Fc5j6fY"
      },
      "outputs": [],
      "source": [
        "build_model_graph_tensor_spec, *_ = _get_preprocessed_dataset(\n",
        "    input_context=None, split_name=\"train\",\n",
        "    per_replica_batch_size=2, size_constraints=None).element_spec"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "b7fhATRUexKh"
      },
      "outputs": [],
      "source": [
        "def _build_model(\n",
        "    # To be called with the build_model_graph_tensor_spec from above.\n",
        "    graph_tensor_spec,\n",
        "    # Dimensions of initial states.\n",
        "    field_of_study_dim=32,\n",
        "    institution_dim=16,\n",
        "    paper_dim=512,\n",
        "    # Dimensions for message passing.\n",
        "    message_dim=128,\n",
        "    next_state_dim=128,\n",
        "    # Dimension for the logits.\n",
        "    num_classes=349,\n",
        "    # Other hyperparameters.\n",
        "    l2_regularization=6e-6,\n",
        "    dropout_rate=0.2,\n",
        "    use_layer_normalization=True,\n",
        "):\n",
        "  # Model building with Keras's Functional API starts with an input object\n",
        "  # (a placeholder for future inputs). This works for composite tensors, too.\n",
        "  graph = input_graph = tf.keras.layers.Input(type_spec=graph_tensor_spec)\n",
        "\n",
        "  # The initial hidden states for each node set.\n",
        "  def set_initial_node_state(node_set, *, node_set_name):\n",
        "    if node_set_name == \"paper\":\n",
        "      # A trainable transformation of the word2vec input features.\n",
        "      return tf.keras.layers.Dense(paper_dim)(node_set[\"feat\"])\n",
        "    elif node_set_name == \"author\":\n",
        "      # The empty initial state for each node was created in preprocessing\n",
        "      # and now comes out here with the correct shape (fixed in case of TPU).\n",
        "      return node_set[\"empty_state\"]\n",
        "    elif node_set_name == \"field_of_study\":\n",
        "      # A trainable embedding (as discussed above).\n",
        "      return tf.keras.layers.Embedding(50_000, field_of_study_dim)(\n",
        "          node_set[\"hashed_id\"])\n",
        "    elif node_set_name == \"institution\":\n",
        "      # A trainable embedding (as discussed above).\n",
        "      return tf.keras.layers.Embedding(6_500, institution_dim)(\n",
        "          node_set[\"hashed_id\"])\n",
        "    else:\n",
        "      raise KeyError(f\"Unexpected node_set_name='{node_set_name}'\")\n",
        "  graph = tfgnn.keras.layers.MapFeatures(\n",
        "      node_sets_fn=set_initial_node_state, name=\"init_states\")(graph)\n",
        "\n",
        "  # Abbreviations for repeated building blocks in the GNN.\n",
        "  def dense(units, *, use_layer_normalization=False):\n",
        "    \"\"\"A Dense layer with regularization (L2 and Dropout) and normalization.\"\"\"\n",
        "    regularizer = tf.keras.regularizers.l2(l2_regularization)\n",
        "    result = tf.keras.Sequential([\n",
        "        tf.keras.layers.Dense(\n",
        "            units,\n",
        "            activation=\"relu\",\n",
        "            use_bias=True,\n",
        "            kernel_regularizer=regularizer,\n",
        "            bias_regularizer=regularizer),\n",
        "        tf.keras.layers.Dropout(dropout_rate)])\n",
        "    if use_layer_normalization:\n",
        "      result.add(tf.keras.layers.LayerNormalization())\n",
        "    return result\n",
        "\n",
        "  def convolution(message_dim, receiver_tag):\n",
        "    return tfgnn.keras.layers.SimpleConv(dense(message_dim), \"sum\",\n",
        "                                         receiver_tag=receiver_tag)\n",
        "\n",
        "  def next_state(next_state_dim, use_layer_normalization):\n",
        "    return tfgnn.keras.layers.NextStateFromConcat(dense(next_state_dim, use_layer_normalization=use_layer_normalization))\n",
        "\n",
        "  # The GNN \"core\" of the model.\n",
        "  # Convolutions let data flow towards the specified endpoint of edges, e.g.,\n",
        "  # along \"cites\" edges from TARGET (cited paper) to SOURCE (citing paper).\n",
        "  # See the text below the colab cell for more explanations.\n",
        "  for i in range(4):\n",
        "    graph = tfgnn.keras.layers.GraphUpdate(node_sets={\n",
        "        \"paper\": tfgnn.keras.layers.NodeSetUpdate(\n",
        "            {\"cites\": convolution(message_dim, tfgnn.SOURCE),\n",
        "             \"written\": convolution(message_dim, tfgnn.SOURCE),\n",
        "             \"has_topic\": convolution(message_dim, tfgnn.SOURCE)},\n",
        "            next_state(next_state_dim, use_layer_normalization)),\n",
        "         \"author\": tfgnn.keras.layers.NodeSetUpdate(\n",
        "            {\"writes\": convolution(message_dim, tfgnn.SOURCE),\n",
        "             \"affiliated_with\": convolution(message_dim, tfgnn.SOURCE)},\n",
        "            next_state(next_state_dim, use_layer_normalization)),\n",
        "         })(graph)\n",
        "\n",
        "  # Read out the hidden state of the root node of each **component** in the\n",
        "  # graph (cf. .merge_batch_to_components() above).\n",
        "  root_states = tfgnn.keras.layers.ReadoutFirstNode(node_set_name=\"paper\")(graph)\n",
        "  # Put a linear classifier on top. (Never use dropout here.)\n",
        "  logits = tf.keras.layers.Dense(num_classes)(root_states)\n",
        "\n",
        "  return tf.keras.Model(input_graph, logits)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "jd02cyRB5DP1"
      },
      "source": [
        "At its core, the GNN model above consists of multiple rounds of graph updates, each expressed in Keras as\n",
        "\n",
        "```\n",
        "graph = GraphUpdate(...)(graph)\n",
        "```\n",
        "\n",
        "with a `graph` of type `GraphTensor`.\n",
        "\n",
        "Several of the [tensorflow_gnn/models](https://github.com/tensorflow/gnn/tree/main/tensorflow_gnn/models) provide ready-to-use `GraphUpdate` classes. The basic \"[OGBN-MAG end-to-end](https://colab.research.google.com/github/tensorflow/gnn/blob/master/examples/notebooks/ogbn_mag_e2e.ipynb)\" tutorial demonstrates the use of the [VanillaMPNNGraphUpdate](https://github.com/tensorflow/gnn/tree/main/tensorflow_gnn/models/vanilla_mpnn). In this advanced tutorial, we dig deeper and explicitly describe how such a graph update can be built with the generic `tfgnn.keras.layers.GraphUpdate` class.\n",
        "\n",
        "In the provided example, we perform four rounds of graph updates. A graph update consists of updates to node sets, that is to say, replacing the `\"hidden_state\"` (` == tfgnn.HIDDEN_STATE`) feature on each of them. The node set updates of one graph update happen in parallel, that is, on the same input `graph`.\n",
        "\n",
        "Each node set update is expressed as a `NodeSetUpdate` layer, which receives the input `graph` and returns a new hidden state for the node set it gets applied to. The new hidden state is computed with the given next-state layer from the node set's prior state and the aggregated results from each incoming edge set.\n",
        "\n",
        "For example, each round of the model above computes a new state for node set \"paper\" by applying `dense(next_state_dim)` to the concatenation of\n",
        "\n",
        "  * the prior state `graph.node_sets[\"paper\"][tfgnn.HIDDEN_STATE]`,\n",
        "  * the result of `convolution(message_dim)(graph, edge_set_name=\"cites\")`,\n",
        "  * the result of `convolution(message_dim)(graph, edge_set_name=\"written\")` and\n",
        "  * the result of `convolution(message_dim)(graph, edge_set_name=\"has_topic\")`.\n",
        "\n",
        "A convolution on an edge set computes a value for each edge (a \"message\") as a trainable function of the node states at both endpoints, and then aggregates the results at the receiver nodes by forming the sum (or mean or max) over all incoming edges.\n",
        "\n",
        "For example, the convolution on edge set \"written\" concatenates the node state of each edge's incident \"paper\" and \"author\" node, applies `dense(message_dim)`, and sums the results over the edges incident to each \"paper\" node (that is, at the `SOURCE` node of each edge).\n",
        "\n",
        "Notice that the conventional names *source* and *target* for the endpoints of a directed edge do **not** prescribe the direction of information flow: each \"written\" edge logically goes from a paper to its author (so the \"author\" node is its `TARGET`), yet this model lets the data flow towards the paper (and the \"paper\" node is its `SOURCE`). In fact, sampled subgraphs have edges directed away from the root node, so data flow towards the root often goes from `TARGET` to `SOURCE`.\n",
        "\n",
        "\u003e Note on terminology: Convolutions are best known in deep learning for convolutional neural networks on image data, in which they aggregate information from a small, fixed, implicitly understood neighborhood of each element in a pixel grid. The term loosely carries over to graphs by interpreting edges as explicit, variable definitions of a node's neighborhood.\n",
        "\n",
        "The code above creates fresh Convolution and NextState layer objects for each edge set and node set, resp., and for each round of updates. This means they all have separate trainable weights. If desired, weight sharing is possible in the standard Keras way by sharing convolution and next-state layer objects, provided the input sizes match.\n",
        "\n",
        "For more information on defining your own GNN models (including those with edge and context states), please refer to the [TF-GNN Modeling Guide](https://github.com/tensorflow/gnn/blob/main/tensorflow_gnn/docs/guide/gnn_modeling.md).\n"
      ]
    },
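    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "The convolution on edge set \"written\" described above can be sketched in plain Python. This is a toy illustration, not how `SimpleConv` is implemented: node states are single floats instead of vectors, `message_fn` is a fixed stand-in for the trainable `dense(message_dim)` applied to the concatenated endpoint states, and `sum_conv_to_source` is a hypothetical name.\n",
        "\n",
        "```python\n",
        "def sum_conv_to_source(source_states, target_states, edges, message_fn):\n",
        "  # One pooled value per SOURCE node; nodes without edges keep 0.0.\n",
        "  pooled = [0.0] * len(source_states)\n",
        "  for s, t in edges:\n",
        "    # The message depends on the states at both endpoints of the edge.\n",
        "    pooled[s] += message_fn(source_states[s], target_states[t])\n",
        "  return pooled\n",
        "\n",
        "def message_fn(paper_state, author_state):\n",
        "  # Fixed weights stand in for a trained layer.\n",
        "  return 0.5 * paper_state + 0.25 * author_state\n",
        "\n",
        "# Toy 'written' edge set: SOURCE is a paper index, TARGET an author index.\n",
        "paper_states = [1.0, 2.0]\n",
        "author_states = [4.0, 8.0, 16.0]\n",
        "written = [(0, 0), (0, 1), (1, 2)]\n",
        "pooled = sum_conv_to_source(paper_states, author_states, written, message_fn)\n",
        "# pooled is [4.0, 5.0]: paper 0 sums two messages, paper 1 receives one.\n",
        "```\n",
        "\n",
        "Although each edge points from paper to author, the pooled result is indexed by paper, which is the sense in which data flows towards `SOURCE`."
      ]
    },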
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "W_Pl5XWwZ4cn"
      },
      "source": [
        "### Training the model\n",
        "\n",
        "To train the Keras Model, as usual, we build it under the distribution strategy scope, compile and fit it.\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "VcrTzMEfexIm"
      },
      "outputs": [],
      "source": [
        "with strategy.scope():\n",
        "  model = _build_model(build_model_graph_tensor_spec)\n",
        "  loss = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)\n",
        "  metrics = [tf.keras.metrics.SparseCategoricalAccuracy(),\n",
        "             tf.keras.metrics.SparseCategoricalCrossentropy(from_logits=True)]"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "wDUbz6FQDZqP"
      },
      "source": [
        "**IMPORTANT:** In `model.compile()`, pass `weighted_metrics=[...]` instead of plain `metrics=[...]` if your code ever gets used with GraphTensors that are padded to fixed sizes. Otherwise the padding mask is not applied in metric computations, so that padding values interfere with the metrics (making them worse than they really are)."
      ]
    },
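    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "The effect of the padding mask on a metric can be illustrated with a small plain-Python sketch. It assumes that the mask acts as per-example weights (1 for real components, 0 for padding), which is how Keras treats the third element of `(inputs, labels, sample_weights)` tuples; the `accuracy` helper and the numbers are made up for illustration.\n",
        "\n",
        "```python\n",
        "def accuracy(labels, predictions, weights=None):\n",
        "  # With weights=None every example counts equally, padding included.\n",
        "  if weights is None:\n",
        "    weights = [1.0] * len(labels)\n",
        "  correct = sum(w for y, p, w in zip(labels, predictions, weights) if y == p)\n",
        "  return correct / sum(weights)\n",
        "\n",
        "# Two real examples followed by two padding components (mask weight 0).\n",
        "labels = [3, 7, 0, 0]\n",
        "predictions = [3, 7, 5, 9]\n",
        "mask = [1.0, 1.0, 0.0, 0.0]\n",
        "unweighted = accuracy(labels, predictions)      # 0.5\n",
        "weighted = accuracy(labels, predictions, mask)  # 1.0\n",
        "```\n",
        "\n",
        "Both real examples are classified correctly, yet the unweighted metric reports only half of that because the meaningless predictions on padding are counted as errors."
      ]
    },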
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "5A-AiYVp4DPS"
      },
      "source": [
        "Training for 5 epochs of sampled subgraphs takes a few hours on a colab with one GPU and should achieve an accuracy above 0.47. Lucky runs with many more epochs can reach 0.51. (Training with a Cloud TPU runtime is faster, but note that this set-up is not optimized specifically for TPUs.)\n",
        "\n",
        "To keep this demo more interactive, we let Keras train and evaluate on fractions of a true epoch for a few minutes only. Of course the resulting accuracy will be poor. To fix that, feel free to edit the `epoch_divisor` according to your patience and ambition. ;-)"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "executionInfo": {
          "elapsed": 213704,
          "status": "ok",
          "timestamp": 1711613886387,
          "user": {
            "displayName": "Arno Eigenwillig",
            "userId": "11315922694496346185"
          },
          "user_tz": -60
        },
        "id": "b6x57RqJ2Pi-",
        "outputId": "ba78f7fd-3159-43cb-a393-bd4dc009f7ef"
      },
      "outputs": [
        {
          "name": "stdout",
          "output_type": "stream",
          "text": [
            "Epoch 1/5\n"
          ]
        },
        {
          "name": "stderr",
          "output_type": "stream",
          "text": [
            "WARNING:tensorflow:Gradients do not exist for variables ['layer_normalization_7/gamma:0', 'layer_normalization_7/beta:0'] when minimizing the loss. If you're using `model.compile()`, did you forget to provide a `loss` argument?\n",
            "WARNING:tensorflow:Gradients do not exist for variables ['layer_normalization_7/gamma:0', 'layer_normalization_7/beta:0'] when minimizing the loss. If you're using `model.compile()`, did you forget to provide a `loss` argument?\n",
            "WARNING:tensorflow:Gradients do not exist for variables ['layer_normalization_7/gamma:0', 'layer_normalization_7/beta:0'] when minimizing the loss. If you're using `model.compile()`, did you forget to provide a `loss` argument?\n",
            "WARNING:tensorflow:Gradients do not exist for variables ['layer_normalization_7/gamma:0', 'layer_normalization_7/beta:0'] when minimizing the loss. If you're using `model.compile()`, did you forget to provide a `loss` argument?\n",
            "WARNING:tensorflow:Gradients do not exist for variables ['layer_normalization_7/gamma:0', 'layer_normalization_7/beta:0'] when minimizing the loss. If you're using `model.compile()`, did you forget to provide a `loss` argument?\n",
            "WARNING:tensorflow:Gradients do not exist for variables ['layer_normalization_7/gamma:0', 'layer_normalization_7/beta:0'] when minimizing the loss. If you're using `model.compile()`, did you forget to provide a `loss` argument?\n",
            "WARNING:tensorflow:Gradients do not exist for variables ['layer_normalization_7/gamma:0', 'layer_normalization_7/beta:0'] when minimizing the loss. If you're using `model.compile()`, did you forget to provide a `loss` argument?\n",
            "WARNING:tensorflow:Gradients do not exist for variables ['layer_normalization_7/gamma:0', 'layer_normalization_7/beta:0'] when minimizing the loss. If you're using `model.compile()`, did you forget to provide a `loss` argument?\n"
          ]
        },
        {
          "name": "stdout",
          "output_type": "stream",
          "text": [
            "49/49 [==============================] - 77s 2s/step - loss: 5.2896 - sparse_categorical_accuracy: 0.0584 - sparse_categorical_crossentropy: 5.2613 - val_loss: 5.2338 - val_sparse_categorical_accuracy: 0.0297 - val_sparse_categorical_crossentropy: 5.2064\n",
            "Epoch 2/5\n",
            "49/49 [==============================] - 40s 807ms/step - loss: 4.9051 - sparse_categorical_accuracy: 0.0829 - sparse_categorical_crossentropy: 4.8783 - val_loss: 4.8596 - val_sparse_categorical_accuracy: 0.0719 - val_sparse_categorical_crossentropy: 4.8332\n",
            "Epoch 3/5\n",
            "49/49 [==============================] - 32s 662ms/step - loss: 4.6366 - sparse_categorical_accuracy: 0.1089 - sparse_categorical_crossentropy: 4.6104 - val_loss: 4.6487 - val_sparse_categorical_accuracy: 0.1063 - val_sparse_categorical_crossentropy: 4.6227\n",
            "Epoch 4/5\n",
            "49/49 [==============================] - 31s 635ms/step - loss: 4.5127 - sparse_categorical_accuracy: 0.1189 - sparse_categorical_crossentropy: 4.4867 - val_loss: 4.4764 - val_sparse_categorical_accuracy: 0.0922 - val_sparse_categorical_crossentropy: 4.4505\n",
            "Epoch 5/5\n",
            "49/49 [==============================] - 31s 628ms/step - loss: 4.4421 - sparse_categorical_accuracy: 0.1288 - sparse_categorical_crossentropy: 4.4162 - val_loss: 4.5154 - val_sparse_categorical_accuracy: 0.0969 - val_sparse_categorical_crossentropy: 4.4895\n"
          ]
        }
      ],
      "source": [
        "epoch_divisor = 100  # To speed up the interactive demo_ds\n",
        "steps_per_epoch = num_training_samples // global_batch_size // epoch_divisor\n",
        "validation_steps = num_validation_samples // global_batch_size  // epoch_divisor\n",
        "epochs = 5\n",
        "learning_rate = tf.keras.optimizers.schedules.CosineDecay(0.001, steps_per_epoch*epochs)\n",
        "with strategy.scope():\n",
        "  model.compile(tf.keras.optimizers.Adam(learning_rate=learning_rate),\n",
        "                loss=loss, weighted_metrics=metrics, steps_per_execution=20)\n",
        "\n",
        "  history = model.fit(\n",
        "      train_ds,\n",
        "      steps_per_epoch=steps_per_epoch,\n",
        "      epochs=epochs,\n",
        "      validation_data=valid_ds,\n",
        "      validation_steps=validation_steps,\n",
        "  )"
      ]
    },
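    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "The step arithmetic and the learning rate schedule used above can be reproduced in plain Python. This is a sketch under assumptions: the sample count and batch size below are not taken from this cell but are chosen to be consistent with the 49 steps per epoch in the recorded output, and `cosine_decay` is a hypothetical helper that mirrors a cosine decay to zero without warmup.\n",
        "\n",
        "```python\n",
        "import math\n",
        "\n",
        "# Assumed values; in the notebook they are defined in earlier cells.\n",
        "num_training_samples = 629_571\n",
        "global_batch_size = 128\n",
        "epoch_divisor = 100\n",
        "epochs = 5\n",
        "\n",
        "steps_per_epoch = num_training_samples // global_batch_size // epoch_divisor\n",
        "decay_steps = steps_per_epoch * epochs\n",
        "\n",
        "def cosine_decay(step, initial_learning_rate=0.001):\n",
        "  # Fraction of the schedule completed, capped at 1.\n",
        "  progress = min(step, decay_steps) / decay_steps\n",
        "  return initial_learning_rate * 0.5 * (1.0 + math.cos(math.pi * progress))\n",
        "\n",
        "# Learning rate at the start, near the middle, and at the end of training.\n",
        "schedule = [cosine_decay(s) for s in (0, decay_steps // 2, decay_steps)]\n",
        "```\n",
        "\n",
        "Note that changing `epoch_divisor` also changes `decay_steps`, so the schedule always reaches its minimum at the end of the last epoch."
      ]
    },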
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "8qoAvUP7Kkzh"
      },
      "source": [
        "## Export for inference"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "PP2p6NVAKpXR"
      },
      "source": [
        "At the end of training, a SavedModel is exported for inference. C++ inference environments like TF Serving do not support input of extension types like GraphTensor, so we export the model with a SavedModel Signature that accepts a batch of serialized tf.Examples and preprocesses them like training did."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "MvgwxJDut9Ca"
      },
      "outputs": [],
      "source": [
        "serving_input = tf.keras.layers.Input(shape=[], dtype=tf.string, name=\"examples\")\n",
        "preproc_input = tfgnn.keras.layers.ParseExample(example_input_spec)(serving_input)\n",
        "preproc_model = _make_preprocessing_model(preproc_input.type_spec,\n",
        "                                          size_constraints=None)\n",
        "model_input, _ = preproc_model(preproc_input)  # Drop labels. (No mask.)\n",
        "logits = model(model_input)\n",
        "serving_output = {\n",
        "    # SavedModel signature outputs are keyed by the name of the last layer.\n",
        "    # Restored Keras model outputs preserve the dict seen here.\n",
        "    # This code puts the same key into both places.\n",
        "    \"logits\": tf.keras.layers.Layer(name=\"logits\")(logits),\n",
        "    \"probabilities\": tf.keras.layers.Layer(name=\"probabilities\")(\n",
        "        tf.math.softmax(logits))}\n",
        "serving_model = tf.keras.Model(serving_input, serving_output)"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "executionInfo": {
          "elapsed": 27862,
          "status": "ok",
          "timestamp": 1711613916561,
          "user": {
            "displayName": "Arno Eigenwillig",
            "userId": "11315922694496346185"
          },
          "user_tz": -60
        },
        "id": "8sauKPKlL8nb",
        "outputId": "02db1b0a-7a71-4eaa-b044-5e0c042a62c9"
      },
      "outputs": [
        {
          "name": "stdout",
          "output_type": "stream",
          "text": [
            "rm: cannot remove '/tmp/exported_keras_model': No such file or directory\n"
          ]
        },
        {
          "name": "stderr",
          "output_type": "stream",
          "text": [
            "WARNING:tensorflow:Compiled the loaded model, but the compiled metrics have yet to be built. `model.compile_metrics` will be empty until you train or evaluate the model.\n",
            "/usr/local/lib/python3.10/dist-packages/tensorflow/python/saved_model/nested_structure_coder.py:458: UserWarning: Encoding a StructuredValue with type tensorflow_gnn.GraphTensorSpec; loading this StructuredValue will require that this type be imported and registered.\n",
            "  warnings.warn(\"Encoding a StructuredValue with type %s; loading this \"\n",
            "/usr/local/lib/python3.10/dist-packages/tensorflow/python/saved_model/nested_structure_coder.py:458: UserWarning: Encoding a StructuredValue with type tensorflow_gnn.ContextSpec.v2; loading this StructuredValue will require that this type be imported and registered.\n",
            "  warnings.warn(\"Encoding a StructuredValue with type %s; loading this \"\n",
            "/usr/local/lib/python3.10/dist-packages/tensorflow/python/saved_model/nested_structure_coder.py:458: UserWarning: Encoding a StructuredValue with type tensorflow_gnn.NodeSetSpec; loading this StructuredValue will require that this type be imported and registered.\n",
            "  warnings.warn(\"Encoding a StructuredValue with type %s; loading this \"\n",
            "/usr/local/lib/python3.10/dist-packages/tensorflow/python/saved_model/nested_structure_coder.py:458: UserWarning: Encoding a StructuredValue with type tensorflow_gnn.EdgeSetSpec; loading this StructuredValue will require that this type be imported and registered.\n",
            "  warnings.warn(\"Encoding a StructuredValue with type %s; loading this \"\n",
            "/usr/local/lib/python3.10/dist-packages/tensorflow/python/saved_model/nested_structure_coder.py:458: UserWarning: Encoding a StructuredValue with type tensorflow_gnn.AdjacencySpec; loading this StructuredValue will require that this type be imported and registered.\n",
            "  warnings.warn(\"Encoding a StructuredValue with type %s; loading this \"\n"
          ]
        }
      ],
      "source": [
        "export_path = \"/tmp/exported_keras_model\"\n",
        "!rm -r {export_path}\n",
        "serving_model.save(export_path, include_optimizer=False)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "hYh7iuDDMBgp"
      },
      "source": [
        "Note that any resources for preprocessing (like lookup tables) have to be attached to the saved object. Our current code has none yet, but composing Keras models as above will take care of it.\n",
        "\n"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "XcW_YAF2MQdJ"
      },
      "source": [
        "For demonstration, let's call the exported model on the example dataset from above, but without labels. We load it as a SavedModel, like TF Serving would. (Using `tf.keras.models.load_model()` instead would rebuild the original Keras layers; see TensorFlow's [Save and load models](https://www.tensorflow.org/tutorials/keras/save_and_load) tutorial for more.)"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "o9u1Mgqz9Qec"
      },
      "outputs": [],
      "source": [
        "restored_model = tf.saved_model.load(export_path)"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "executionInfo": {
          "elapsed": 720,
          "status": "ok",
          "timestamp": 1711613924501,
          "user": {
            "displayName": "Arno Eigenwillig",
            "userId": "11315922694496346185"
          },
          "user_tz": -60
        },
        "id": "DAxsZQjQ9WwR",
        "outputId": "0b6f4b26-3c9b-4b39-c03c-b1ab5ff7953f"
      },
      "outputs": [
        {
          "name": "stdout",
          "output_type": "stream",
          "text": [
            "The predicted class for input 0 is 258 with predicted probability 0.08846\n",
            "The predicted class for input 1 is   1 with predicted probability 0.4106\n",
            "The predicted class for input 2 is   1 with predicted probability 0.4112\n",
            "The predicted class for input 3 is 134 with predicted probability 0.1854\n",
            "The predicted class for input 4 is 258 with predicted probability 0.1949\n",
            "The predicted class for input 5 is   1 with predicted probability 0.3494\n",
            "The predicted class for input 6 is 134 with predicted probability 0.1438\n",
            "The predicted class for input 7 is 112 with predicted probability 0.04063\n",
            "The predicted class for input 8 is 112 with predicted probability 0.02867\n",
            "The predicted class for input 9 is 283 with predicted probability 0.09065\n"
          ]
        }
      ],
      "source": [
        "def _clean_example_for_serving(serialized):\n",
        "  example = tf.train.Example.FromString(serialized)\n",
        "  example.features.feature.pop(\"nodes/paper.labels\")\n",
        "  return example.SerializeToString()\n",
        "\n",
        "num_examples = 10\n",
        "clean_examples = [_clean_example_for_serving(gt.numpy()) for gt in itertools.islice(demo_ds, num_examples)]\n",
        "\n",
        "clean_ds = tf.data.Dataset.from_tensor_slices(clean_examples)\n",
        "for serialized_example in clean_ds.batch(num_examples).take(1):\n",
        "  outputs = restored_model.signatures[\"serving_default\"](\n",
        "      examples=serialized_example)\n",
        "  probabilities = outputs[\"probabilities\"].numpy()\n",
        "  classes = probabilities.argmax(axis=1)\n",
        "  for i, c in enumerate(classes):\n",
        "    print(f\"The predicted class for input {i} is {c:3} \"\n",
        "          f\"with predicted probability {probabilities[i, c]:.4}\")"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "Sae7WfNf44mJ"
      },
      "source": [
        "Recall that this is not a fully trained model, so the results will be inaccurate, unless you changed `epoch_divisor` above."
      ]
    },
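    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "For readers who want to see how the printed probabilities relate to the logits returned by the serving signature, here is a small plain-Python sketch of softmax followed by argmax. The logits are hypothetical and cover only 4 classes, whereas the model above outputs 349.\n",
        "\n",
        "```python\n",
        "import math\n",
        "\n",
        "def softmax(logits):\n",
        "  # Subtract the maximum for numerical stability; the result is unchanged.\n",
        "  largest = max(logits)\n",
        "  exps = [math.exp(x - largest) for x in logits]\n",
        "  total = sum(exps)\n",
        "  return [e / total for e in exps]\n",
        "\n",
        "# Hypothetical logits for one input.\n",
        "logits = [0.5, 2.0, -1.0, 0.0]\n",
        "probabilities = softmax(logits)\n",
        "# Index of the largest probability, like argmax over the class axis.\n",
        "predicted_class = max(range(len(probabilities)), key=probabilities.__getitem__)\n",
        "```\n",
        "\n",
        "Since softmax preserves the ordering of its inputs, the predicted class could equally be read off the logits; the probabilities mainly add a rough measure of confidence."
      ]
    },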
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "mtM1CfDO0wBF"
      },
      "source": [
        "## Next steps"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "s89Zwsuk0yXE"
      },
      "source": [
        "This tutorial has shown how to solve a node classification problem in a large graph with TF-GNN, using\n",
        "  * the graph sampler tool to obtain manageable inputs for each classification target,\n",
        "  * a TensorFlow model built in Colab with tfgnn.keras.layers.\n",
        "\n",
        "The [Data Preparation and Sampling](https://github.com/tensorflow/gnn/blob/main/tensorflow_gnn/docs/guide/data_prep.md) guide describes how you can create training data for other datasets.\n",
        "\n",
        "The [Modeling](https://github.com/tensorflow/gnn/blob/main/tensorflow_gnn/docs/guide/gnn_modeling.md) guide explains how to use other GNN architectures or write your own."
      ]
    }
  ],
  "metadata": {
    "accelerator": "GPU",
    "colab": {
      "collapsed_sections": [
        "ScitaPqhKtuW"
      ],
      "gpuType": "T4",
      "last_runtime": {
        "build_target": "//research/colab/notebook:notebook_backend_py3",
        "kind": "private"
      },
      "provenance": [
        {
          "file_id": "1CYTse8C94LiKNw12_VRsqFAY9eUK5cKC",
          "timestamp": 1711619616130
        },
        {
          "file_id": "https://github.com/tensorflow/gnn/blob/main/examples/notebooks/ogbn_mag_indepth.ipynb",
          "timestamp": 1711612125828
        }
      ]
    },
    "kernelspec": {
      "display_name": "Python 3",
      "name": "python3"
    },
    "language_info": {
      "name": "python"
    }
  },
  "nbformat": 4,
  "nbformat_minor": 0
}
